EarthPT
700 million parameters
14 billion tokens
A Foundation Model for Earth Observation
We are proud to announce EarthPT: Aspia Space’s latest work developing a foundation model for Earth Observation (EO).
EarthPT is a 700 million parameter “Large Observation Model” (LOM) trained on over 14 billion tokens of EO data, leveraging our unique ClearSky cloud-free spatio-temporal dataset.
Villalobos et al. (2022) have predicted that the stock of high-quality data for training Large Language Models (LLMs) will be exhausted by 2026, and therefore new sources will be required to continue scaling and improving large neural models. Remote sensing data is one such source. By harnessing the power of this rich, diverse and continually evolving data via LOMs, Aspia Space is creating the next generation of data-driven EO solutions, tackling challenges in global food security, productivity and sustainability, land-use management and environmental monitoring.
We have found that EarthPT follows the usual neural scaling laws, with performance improving predictably as model size, dataset size and compute grow, just like a traditional LLM. Because remote sensing data is so abundant, LOMs like EarthPT can sidestep the data bottleneck that will throttle traditional language-based models. The parallels between LLMs and LOMs run deeper: just as an LLM predicts the most likely next word in a sequence, EarthPT predicts the most likely next values in a sequence of remote sensing variables. This lets us look into the future, anticipating events and acting before they happen.
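The autoregressive pattern shared by LLMs and LOMs can be sketched in a few lines. In the toy example below, a trivial linear extrapolator stands in for EarthPT's transformer (the real model is a ~700 million parameter network; the predictor, the variable names and the sample values here are all hypothetical, for illustration only). The key idea is the rollout loop: each prediction is appended to the sequence and conditioned on, exactly as an LLM feeds generated tokens back into its context.

```python
# Toy sketch of autoregressive forecasting over an EO time series.
# A trivial AR(2)-style extrapolator stands in for the real transformer.

def predict_next(history):
    """Stand-in 'model': linearly extrapolate from the last two observations."""
    return 2 * history[-1] - history[-2]

def rollout(history, steps):
    """Autoregressively predict `steps` future values, feeding each
    prediction back in as context for the next, LLM-style."""
    seq = list(history)
    for _ in range(steps):
        seq.append(predict_next(seq))
    return seq[len(history):]

# Made-up sequence of a remote sensing variable (e.g. a vegetation index)
observed = [0.10, 0.15, 0.20, 0.25]
forecast = rollout(observed, 3)
print([round(v, 2) for v in forecast])
```

Swapping the stand-in predictor for a trained transformer decoder gives the LOM setting: the rollout loop is unchanged, only the model producing each next step differs.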
And this is just the start: EarthPT is currently trained on a small subset of our overall ClearSky dataset. We are now scaling up, incorporating orders of magnitude more tokens and multi-modal data to create transformative new products for our customers.