Decision Trees in the Classification of Astronomical Data
Optical astronomy records are an extremely important source of information, and the measurements they contain are fundamental for classifying stars and galaxies. This work describes the J4.8 decision tree induction algorithm and its application to building classifiers, based on photometric attributes, that separate astronomical objects into stars and galaxies. Data from the Sloan Digital Sky Survey (SDSS) were used to train and validate the classifiers. On the test set, the classifiers achieved accuracy above 98% for stars and above 99% for galaxies.
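J4.8 is the Weka implementation of Quinlan's C4.5, which grows a tree by repeatedly choosing the split that best separates the classes. As a minimal sketch of that single step, and not the authors' implementation, the Python below scores binary threshold splits on one numeric attribute by gain ratio. The function names, the toy attribute values, and the labels are all hypothetical and are not SDSS data. The full algorithm also handles pruning, missing values, and further refinements that are omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one attribute."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    w_left, w_right = len(left) / total, len(right) / total
    # Information gain: entropy before the split minus weighted entropy after.
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    # Split information penalizes very unbalanced partitions.
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Score midpoints between consecutive distinct values (a simplification);
    return the (threshold, gain_ratio) pair with the highest gain ratio."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        return None, 0.0
    scored = [(t, gain_ratio(values, labels, t)) for t in candidates]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy data: one made-up photometric attribute in arbitrary units.
toy_values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
toy_labels = ["star", "star", "star", "galaxy", "galaxy", "galaxy"]
threshold, score = best_threshold(toy_values, toy_labels)
print(threshold, score)
```

On this toy set the two classes are perfectly separable, so the selected threshold falls in the gap between the two groups. A tree builder would apply the same selection recursively to each resulting subset, over all available attributes.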
Authors of articles published in the journal Trends in Computational and Applied Mathematics retain the copyright of their work. The journal uses Creative Commons Attribution (CC-BY) in published articles. The authors grant the TCAM journal the right to first publish the article.