State Data Agency (Statistics Lithuania) Launches a New Training Season


This September, the State Data Agency launches an updated training program: the second season of State Data Lake training. The sessions are intended for public sector representatives who work with data and make decisions based on analysis. The goal is to provide practical knowledge of how to use modern tools effectively, process data, and make decisions based on reliable facts.
Since 2024, more than 2,000 participants from various institutions have taken part in training sessions prepared by the State Data Agency's experts.
This month focuses on education, data integration, ontology development, visualization design, and municipal analytics. Below we present detailed information on all September sessions.
Introduction to the Data Lake
This month’s introductory session will analyze data from the Lithuanian education sector. Participants will get acquainted with:
- Information from the registers of the National Agency for Education (NŠA): data on pupils, students, and teachers, study programmes, diplomas and certificates.
- Non-formal education and school administration data: the Register of Non-Formal Education Programs (NŠPR), the Register of Education, Science and Sports Institutions (ŠMIR), the Education Management Information System (ŠVIS), the National Education Management Information System (NEMIS).
- Admission processes: data from the Lithuanian Higher Institutions Association for Organizing Joint Admission (LAMA BPO) and from the KURSUOK! program.
- Attendance information from school e-diaries, and data from the kindergarten and school queue and application systems.
- Social and economic education indicators: salaries, age structure of teachers, number of employees (data from “SoDra”).
- Additional sources: textbook data from the Lithuanian Integral Information System of Libraries (LIBIS), data from the Culture Pass, information on Olympiad participants (the Olympiad Winners Database (ONDB)), and career guidance (the Pupils’ Career Education Information System (MUKIS)).
This is an excellent opportunity to get acquainted with the broad education data ecosystem, which is useful both for strategic planning and for everyday analytics.
Pipeline Builder – a data integration and transformation tool
Pipeline Builder is a tool with a graphical interface designed to create data transformation flows without programming knowledge. It is particularly useful for:
- standard data processing tasks (filtering, aggregation, field selection, type conversion),
- preparing geographic data (spatial joins, buffer analysis, layer creation),
- preparing data for visualization in other tools, such as Contour or Workshop.
During the training, participants will not only get acquainted with the tool’s features but also gain practical experience in creating data flows that they will be able to apply in their daily work.
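For readers who think in code, the standard processing steps listed above (filtering, type conversion, aggregation) can be sketched in a few lines of Python. This is only an illustration of the logic that such a flow expresses, not Pipeline Builder's own interface; the sample rows, field names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; field names are illustrative, not real Data Lake columns.
rows = [
    {"municipality": "Vilnius", "school_type": "gymnasium", "pupils": "520"},
    {"municipality": "Vilnius", "school_type": "primary", "pupils": "310"},
    {"municipality": "Kaunas", "school_type": "gymnasium", "pupils": "450"},
    {"municipality": "Kaunas", "school_type": "gymnasium", "pupils": "n/a"},
]


def to_int(value):
    """Type conversion: return an int, or None when the text is not a number."""
    try:
        return int(value)
    except ValueError:
        return None


def pupils_by_municipality(records, school_type):
    """Filter by school type, convert counts, and sum them per municipality."""
    totals = defaultdict(int)
    for record in records:
        if record["school_type"] != school_type:  # filtering
            continue
        pupils = to_int(record["pupils"])  # type conversion
        if pupils is None:  # skip values that cannot be parsed
            continue
        totals[record["municipality"]] += pupils  # aggregation
    return dict(totals)


result = pupils_by_municipality(rows, "gymnasium")
print(result)
```

In a graphical tool each of these steps would typically be a separate block in the flow; the sketch simply shows the order in which they are applied.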
Ontology and Workshop – semantic data structures and their application
The September session is intended for more advanced Data Lake users who want not only to view or transform data but also to define its semantic structure and turn it into practical solutions for everyday use. The Ontology tool makes it possible to define which objects exist in the data and how they relate to one another: from schools and teachers to their actions, attributes, and interactions. In this way, a clear semantic schema is created, allowing users to find information faster, analyze it, and present it to others.
When data is structured according to the ontology principle, it becomes easy not only to search for specific records but also to analyze the entire context. For example, it is possible to understand which factors influence teacher turnover, in which municipalities specialists in certain fields are most lacking, or which object relationships have the greatest impact on results.
The Workshop tool complements the Ontology and enables practical use of this data. In a single environment, users can not only view information but also create tools for everyday work: from dashboards to data collection forms.
During the training, the following topics will be covered:
- how to prepare data for use in the Ontology,
- how to create object types, their attributes, and describe relationships between objects,
- how to configure actions and apply them to everyday scenarios,
- how to build a data collection or analysis application.
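The ideas of object types, attributes, relationships, and actions can be illustrated with a minimal Python sketch. It is not the Ontology tool's actual data model or API; the class names, fields, and functions below are hypothetical and chosen to echo the schools-and-teachers example.

```python
from dataclasses import dataclass, field


@dataclass
class Teacher:
    """Object type with three attributes (all hypothetical)."""
    teacher_id: str
    subject: str
    birth_year: int


@dataclass
class School:
    """Object type whose `teachers` list models a School -> Teacher relationship."""
    school_id: str
    name: str
    municipality: str
    teachers: list = field(default_factory=list)


def assign_teacher(school, teacher):
    """Action: link a teacher to a school, ignoring duplicates by teacher_id."""
    if all(t.teacher_id != teacher.teacher_id for t in school.teachers):
        school.teachers.append(teacher)


def subjects_missing(school, required_subjects):
    """Return required subjects with no linked teacher, in sorted order."""
    taught = {t.subject for t in school.teachers}
    return sorted(set(required_subjects) - taught)


school = School("S1", "Example Gymnasium", "Vilnius")
assign_teacher(school, Teacher("T1", "mathematics", 1970))
assign_teacher(school, Teacher("T1", "mathematics", 1970))  # duplicate, ignored
assign_teacher(school, Teacher("T2", "physics", 1985))
missing = subjects_missing(school, ["mathematics", "physics", "chemistry"])
print(missing)
```

Once relationships are explicit, a question such as "which subjects lack a teacher at this school" becomes a simple traversal of linked objects rather than a search across separate tables.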
For organizations, the Ontology and Workshop tools:
- enable the creation of a common data “dictionary” and reduce interpretation errors,
- automate data flows and reduce manual work,
- allow faster response to changes and access to real-time information,
- improve data management processes and ensure clear visibility of data lineage.
Such training is particularly valuable for public sector institutions that aim to solve data fragmentation issues, standardize terminology, and develop interactive tools for data-driven decision-making.
Contour – analysis and visualizations
Contour, the most user-friendly Data Lake tool, enables users to:
- view and analyze data,
- perform filtering, grouping, and joins,
- create simple reports and dashboards.
During the September session, we will carry out practical exercises together:
- analyzing Soviet-era street names based on the Address Register data,
- identifying the trajectory of a hostile drone,
- visualizing public transport GPS data in major cities.
Additional Data Lake functionalities will also be introduced: Data Lineage, Fusion, Forms.
Municipality DataLab – specialized analytical tools for municipalities
The Municipality DataLab is an interactive analytical environment designed specifically for municipal administrations. It integrates various national data sources and presents them in a way that allows municipal specialists to quickly answer pressing questions, plan resources, and make informed decisions.
Currently, the application covers the following topics:
- Education – number and distribution of pupils and teachers, analysis of teachers’ age structure, class sizes, attendance.
- Evacuation of vulnerable groups – data on shelters, collective protection buildings, and evacuation scenario modeling.
- Infrastructure – information on roads, buildings, and engineering networks to support investment planning and maintenance.
September’s focus – “Aging of Teachers.” During the training, participants will learn how to:
- identify trends in age structure,
- compare their municipality with national indicators,
- model future forecasts for teacher demand.
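The first two tasks above amount to simple arithmetic, sketched here in Python. The ages are made-up sample values, the municipality names are placeholders, and the 55-year threshold is an assumption chosen for illustration, not a definition used by the Municipality DataLab.

```python
SENIOR_AGE = 55  # assumed threshold for illustration only


def share_senior(ages):
    """Share of teachers aged SENIOR_AGE or older; None for an empty list."""
    if not ages:
        return None
    return sum(1 for age in ages if age >= SENIOR_AGE) / len(ages)


# Hypothetical teacher ages per municipality.
teacher_ages = {
    "Municipality A": [29, 41, 56, 60, 63],
    "Municipality B": [33, 38, 45, 57],
}

# The "national" indicator here is simply the pooled sample of all municipalities.
national_ages = [age for ages in teacher_ages.values() for age in ages]
national_share = share_senior(national_ages)

# Positive gap: the municipality has a larger share of older teachers than the pooled figure.
gaps = {
    name: share_senior(ages) - national_share
    for name, ages in teacher_ages.items()
}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1%} relative to the pooled share of {national_share:.1%}")
```

A real comparison would use register data and an agreed definition of the age groups; the sketch only shows how a municipal share and its gap to the national figure are derived.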
Why participate in Data Lake training?
Data Lake training provides an opportunity to gain practical skills, as it focuses on solving real tasks and performing data analysis. Participants are introduced to open data sources, registers, and ways to integrate them, so they can apply what they learn in their daily work. All tools are designed to be user-friendly, even for those without programming skills, making the training accessible to a wide range of professionals. Moreover, these sessions bring together a professional community: participants can discuss, share experiences, and establish new collaborations.
In the coming months, planned topics include health, population demographics, transport, environmental protection, economics, infrastructure, and other areas relevant to the public sector.
More information on the training can be found here.
