MetaCarta Research
Here you can find research papers published by MetaCarta's scientists and technical staff.
2008
Defects in nematic membranes can buckle into pseudospheres John R. Frank and Mehran Kardar Phys. Rev. E 77, 041705 pdf
Parallel creation of gigaword corpora for medium density languages -- an interim report Peter Halacsy, Andras Kornai, Peter Nemeth, Daniel Varga Proc LREC 2008, to appear pdf
Google for the linguist on a budget Andras Kornai and Peter Halacsy Proc 4th Web as Corpus Workshop, LREC 2008, to appear pdf
2007
Mathematical Linguistics Andras Kornai. Springer Verlag. In the series Advanced Information and Knowledge Processing, ISBN 978-1-84628-985-9 book webpage
Probabilistic grammars and languages M. Kracht, G. Penn and E. Stabler (eds) 10th Workshop on the Mathematics of Language preproceedings pdf
HunPos -- an open source trigram tagger (Jointly with P. Halacsy, __, Cs. Oravecz). In S. Ananiadou (ed) Proc. ACL2007 Demo and Poster Sessions 209-212 pdf
Parallel corpora for medium density languages (Jointly with D. Varga, P. Halacsy, __, V. Nagy, L. Nemeth, V. Tron). In N. Nicolov, K. Bontcheva, G. Angelova and R. Mitkov (eds): Recent Advances in Natural Language Processing IV. Selected papers from RANLP-05 John Benjamins, 2007, 247-258 pdf
2006
Partial Metrics and Quantale-valued Sets Michael Bukatin, Ralph Kopperman, Steve Matthews, and Homeira Pajoohesh. In D. Cenzer et al. (eds): CCA 2006: Proceedings of the Third International Conference on Computability and Complexity in Analysis. Informatik Berichte 336 (09/2006), FernUniversitaet in Hagen 91-92 pdf
Google Maps Hacks Rich Gibson and Schuyler Erle. O'Reilly, Sebastopol CA book
Using a morphological analyzer in high precision POS tagging of Hungarian Peter Halacsy, Andras Kornai, Csaba Oravecz, Viktor Tron, and Daniel Varga N. Calzolari and K. Choukri (eds) Proc. LREC 2006 2245-2248 pdf
Web-based frequency dictionaries for medium density languages Peter Halacsy, Andras Kornai, Csaba Oravecz, Viktor Tron, and Daniel Varga. In A. Kilgarriff and M. Baroni (eds) Proc. 2nd Web as Corpus Workshop (EACL 2006 WS01) 1-8 pdf
2005
Mapping Hacks: Tips & Tools for Electronic Cartography Schuyler Erle, Rich Gibson, Jo Walsh. O'Reilly, Sebastopol CA book
Unlocking Your Nokia Phone Schuyler Erle in Michael Yuan (ed): Nokia Smartphone Hacks. O'Reilly, Sebastopol CA book
Various articles Schuyler Erle in Rob Flickenger and Roger Weeks (eds): Wireless Hacks. O'Reilly, Sebastopol CA book
Evaluating geographic information retrieval In C. Peters, F. Gey, J. Gonzalo, H. Mueller, G. Jones, M. Kluck, B. Magnini, and M. de Rijke (eds): Accessing Multilingual Information Repositories. Revised Selected Papers of the Cross-Language Evaluation Forum (CLEF 2005) Springer LNCS 4022, 928-938 pdf
Dependency-based Statistical Machine Translation Heidi Fox. In C. Callison-Burch and S. Wan (eds): Proc. 2005 ACL Student Workshop 91-96 pdf
Hunmorph: open source word analysis Peter Halacsy, Andras Kornai, Laszlo Nemeth, Viktor Tron, Gyorgy Gyepesi, and Daniel Varga. In M. Jansche (ed): Proc. ACL 2005 Software Workshop 77-85 pdf
2004
Invasion and Extinction in the Mean Field Approximation for a Spatial Host-Pathogen Model M.A.M. de Aguiar, E. M. Rauch, and Y. Bar-Yam. Journal of Statistical Physics 114: 1417-1451 (2004). pdf
Creating open language resources for Hungarian (Jointly with P. Halacsy, __, L. Nemeth, A. Rung, I. Szakadat, V. Tron). In Proc. LREC 2004 203-210 pdf
Leveraging the open source ispell codebase for minority language analysis (Jointly with L. Nemeth, V. Tron, P. Halacsy, __, A. Rung, I. Szakadat). In J. Carson-Berndsen (ed): Proc. SALTMIL 2004 56-59 pdf
2003
Mean-field approximation to a spatial host-pathogen model M.A.M. de Aguiar, E. M. Rauch, and Y. Bar-Yam. Physical Review E 67: 047102 (2003). pdf
Fractal geometry of critical Potts clusters J. Asikainen, A. Aharony, B.B. Mandelbrot, E.M. Rauch, and J.-P. Hovi European Physical Journal B 34: 479 (2003). pdf
Building a high performance gazetteer database Amittai Axelrod Proc. HLT-NAACL WS9 pdf
Proceedings of the HLT-NAACL Workshop on the Analysis of Geographic References Andras Kornai and Beth Sundheim (eds): Association for Computational Linguistics, 2003, ISBN 1-932432-04-3 (WS9), paperbound, vi+81 pages.
Classifying the Hungarian web (Jointly with __, M. Krellenstein, M. Mulligan, D. Twomey, F. Veress, A. Wysoker) In A. Copestake and J. Hajic (eds): Proc. EACL 2003 203-210 pdf
Explicit finitism International Journal of Theoretical Physics 2003/2 301-307 pdf
Mathematical Linguistics (Jointly with G.K. Pullum, __) In W. Frawley (ed): Oxford International Encyclopedia of Linguistics, Oxford University Press 2003, v3 17-20 pdf
Optical Character Recognition In W. Frawley (ed): Oxford International Encyclopedia of Linguistics, Oxford University Press 2003, v3 33-34 pdf
A confidence-based framework for disambiguating geographic terms Erik Rauch, Michael Bukatin, and Kenneth Baker Proc. HLT-NAACL WS9 pdf
Discrete, Amorphous Physical Models Erik M. Rauch International Journal of Theoretical Physics 42: 329-348 (2003)
Dynamics and genealogy of strains in spatially extended host-pathogen models Erik M. Rauch, Hiroki Sayama and Yaneer Bar-Yam Journal of Theoretical Biology 221: 655-664 (2003). pdf
2002
Stability and Instability of Polymorphic Populations and the Role of Multiple Breeding Seasons in Phase III of Wright's Shifting Balance Theory M.A.M. de Aguiar, H. Sayama, E. M. Rauch, Y. Bar-Yam, and M. Baranger. Physical Review E 65: 031909 (2002). pdf
Logic of Fixed Points and Scott Topology Michael A. Bukatin Topology Proceedings vol. 26, 2002, pp. 433-468 pdf
Mathematics of Domains PhD Thesis Michael A. Bukatin, Department of Computer Science, Brandeis University, February 2002. pdf
Phrasal Cohesion and Statistical Machine Translation Heidi J. Fox. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), pp. 304-311 pdf
How many words are there? Glottometrics 2002/4 61-86 pdf
Linear Discriminant Text Classification in High Dimension. (Jointly with __, J.M. Richards) In A. Abraham and M. Koeppen (eds): Hybrid Information Systems Physica Verlag, Heidelberg, 2002 527-538 pdf
1998 - 2001
Zipf's law outside the middle range Proc. Sixth Meeting on Mathematics of Language University of Central Florida, 1999 347-356 pdf
A Robust, Language-Independent OCR System. (Jointly with Z. Lu, I. Bazzi, __, J. Makhoul, P. Natarajan, R. Schwartz) In: Robert J. Mericsko (ed): Proc. 27th AIPR Workshop: Advances in Computer-Assisted Recognition SPIE Proceedings 3584 1999 pdf
Quantitative Comparison of Languages. Grammars 1998/2 155-165 pdf
Conferences, Workshops
The big meetings of the information retrieval community are TREC, SIGIR, and CLEF.
The most important natural language understanding/computational linguistics meetings are organized by the Association for Computational Linguistics, with various annual and bi-annual meetings (the European ACL, the North American ACL, and the Applied ACL conferences are the most significant) all around the world. COLING is sometimes held jointly with the ACL.
For natural language data the key venue is the biennial LREC, with good multilingual material at AMTA and RIAO as well.
The big pattern recognition and machine learning conferences ICPR, ICASSP, ICML also tend to carry relevant material.
GIS people have their own big meetings, such as CALGIS.
Good summer schools include ESSLLI and LSA for linguistics. See the new ML blog for machine learning.
Tech Reports and Hosted Material
ECAI 1998 Workshop on Extended Finite State Models of Language
HLT-NAACL 2003 Workshop on the Analysis of Geographic References
Mathematical Linguistics