Decision Trees in the Classification of Astronomical Data
DOI:
https://doi.org/10.5540/tema.2009.010.01.0075
Abstract
Optical astronomy records are an extremely important source of information, and these measurements are fundamental for classifying stars and galaxies. This work describes the J4.8 decision tree construction algorithm and its application to building classifiers, based on photometric attributes, that separate astronomical objects into stars and galaxies. Data from the Sloan Digital Sky Survey (SDSS) project were used to train and validate the classifiers. On the test set, the classifiers achieved accuracy rates above 98% for stars and above 99% for galaxies.
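To make the idea of a tree split concrete, the following is a minimal sketch, not the authors' implementation. J4.8 is Weka's implementation of C4.5, which ranks candidate splits by gain ratio. The sketch computes the gain ratio of a threshold split on one numeric attribute. The attribute name `concentration`, its values, and the helper names are hypothetical and chosen only for illustration. Pruning and the other refinements of C4.5 are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = (len(left) / total, len(right) / total)
    # Information gain: entropy before the split minus weighted entropy after it.
    info_gain = entropy(labels) - (
        weights[0] * entropy(left) + weights[1] * entropy(right)
    )
    # Split information penalizes very unbalanced partitions.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Evaluate midpoints between consecutive distinct values.

    Returns a (threshold, gain_ratio) pair for the best candidate.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    scored = ((t, gain_ratio(values, labels, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# Synthetic, made-up values of a hypothetical photometric attribute.
concentration = [0.02, 0.05, 0.08, 0.10, 0.45, 0.60, 0.75, 0.90]
labels = ["star"] * 4 + ["galaxy"] * 4

threshold, ratio = best_threshold(concentration, labels)
print(f"best threshold: {threshold:.3f}, gain ratio: {ratio:.3f}")
```

A full tree learner would apply this search recursively to each resulting subset and over every available attribute, then prune the result. Here the toy data are perfectly separable, so the best split falls midway between the two groups.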