Date: 2023-05-13

AI2 (the Allen Institute for AI) is developing a new language model called the Open Language Model (OLMo) in collaboration with AMD, the Large Unified Modern Infrastructure (LUMI) consortium, Surge AI and MosaicML. OLMo aims to bridge the gap between public and private research capacities in the field of language models. Unlike many existing models, OLMo will be open source, allowing researchers to access and improve its components.

It will also focus on leveraging and better understanding academic textbooks and articles, making it especially well-suited for scientific and academic applications.

An open language model for scientific advancement

AI2 recognizes the need for open language models to drive scientific and technological advancement. OLMo is presented as a complete platform rather than simply a model, which means that the research community can use each component developed by AI2 and seek to improve it. Everything built for OLMo will be openly available, including a public demo, training dataset, API, and documentation. This openness is meant to encourage collaboration, with the ultimate goal of collaboratively building the best open language model in the world.

Focus on understanding scientific and academic texts

One of OLMo’s distinctive features is its focus on understanding and using textbooks and academic articles. Although there have been previous attempts, such as Meta’s Galactica model, AI2 is drawing on its experience in academia and on tools it has developed for research, such as Semantic Scholar, to make OLMo particularly suitable for scientific and academic applications. The goal is for OLMo to better analyze and understand the information contained in these specialized texts, which could enable significant advances in scientific research.

Ethical and legal considerations

Given the potential for misuse of generative AI models, AI2 is aware of the ethical and legal challenges surrounding OLMo. To address these issues, the OLMo team will work closely with AI2’s legal department and will seek advice from external experts. At different stages of the model development process, ethical and intellectual property rights assessments will be conducted. AI2 is committed to promoting an open and transparent dialogue about the model and its intended use, to understanding how to mitigate issues such as bias and toxicity, and to highlighting outstanding research questions in the scientific community.

Collaborative contributions and critiques

AI2 invites external collaborators to contribute and provide constructive criticism during the OLMo development process. This demonstrates AI2’s willingness to receive input from various experts and the community at large. The participation of external collaborators will enrich the model and help to identify possible improvements and additional areas for development. Those interested in participating can contact the organizers of the OLMo project.

What does the OLMo project represent for science?

AI2’s OLMo project represents a significant step towards the democratization of language models and scientific advancement. By providing a comprehensive and open language model, AI2 seeks to bridge the gap between public and private research, fostering collaboration and allowing the scientific community to work directly on improving the model.

OLMo’s focus on understanding scientific and academic texts is especially promising. Enabling the model to understand and take advantage of the vast amount of knowledge contained in academic books and articles opens the door to significant advances in scientific research and to the development of more effective and safer technologies.

However, it is also important to address the ethical and legal challenges associated with generative AI models. AI2 is taking proactive steps by working with legal experts and establishing an ethical review committee. Transparency and openness in dialogue with the community are critical to understanding and mitigating potential risks and ensuring that the model is used responsibly.

