Annual folder structure of ready-made datasets

Research Services will introduce an annual folder structure in some of its ready-made datasets during summer 2026. The purpose of these annual folders is to make it easier to grant access rights in FIONA for specific years only. The change affects not only the structure of some datasets but also the location of all ready-made datasets in FIONA, so researchers need to take it into account in their analysis code.

Background

Statistics Finland has strengthened the protection of unit-level datasets in recent years. As one measure, the criteria for granting access to full ready-made datasets were tightened, and extracts of ready-made datasets limited by target population and years have been preferred. Previously, access was often granted to an entire ready-made dataset covering all years, which was not optimal from the perspective of the data minimisation principle.

However, limiting ready-made datasets by target population or by year has had to be carried out manually for each project separately. This has increased the workload of Research Services and raised the number of working hours billed to research projects.

For this reason, Research Services has decided to divide ready-made datasets into annual entities in advance, in order to reduce the additional work caused by year-based restrictions and, consequently, lower costs for research projects.

In connection with the introduction of the annual folder structure, a database reform was also implemented in FIONA administration to ensure a more stable foundation for data distribution in the future.

Which ready-made datasets are organised into year folders?

When planning the annual folder structure, it was observed that not all ready-made datasets provided by Research Services are suitable for being divided into year-specific entities, as some datasets contain period variables defined in different ways. Some of these ready-made datasets may include a year variable that is not relevant for limiting research years, because it refers, for example, to the year in which the data was received by Statistics Finland. Variables related to years or time points may also contain a significant amount of missing information.

Ready-made datasets that are not divided into year folders are available in the same way as before: they can be made available in full to a research project if required, or, if they need to be limited for research purposes, the restriction is carried out as chargeable work and billed to the research project. Two hours of such work are included in the price of a ready-made dataset.

Information on which ready-made datasets are organised into year folders and which are not will be available in the new updating schedule for ready-made datasets.

How does the introduction of year folders affect researchers?

All ready-made datasets will be moved to a new location in FIONA. User access rights to the folders will be updated to correspond to the user licence; for example, projects that have had read access to a continuously updated full ready-made dataset will be granted read access to all annual folders of that dataset. Read access will then also extend to future annual folders of the dataset until the user licence expires.

The new location in FIONA is called annual, and both datasets organised into year folders and those that are not will be moved there. All new data updates will be delivered to the new annual location as of 1 June 2026, and will no longer be delivered to the ready-made/continuous location in FIONA.

Researchers must take the new data location into account in their data integration and analysis code. References to dataset paths must be updated from the previous ready-made/continuous location to the annual location. In addition, datasets organised into year folders must be combined separately into time series datasets.
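As an illustration of combining year folders into a time series dataset, the following is a minimal Python sketch rather than an official procedure. It assumes a hypothetical layout in which each year folder sits directly under the dataset folder and contains CSV files; the actual folder names and file formats in FIONA may differ. The function name `combine_year_folders`, the added `source_year` field and the example path are all illustrative assumptions.

```python
import csv
from pathlib import Path


def combine_year_folders(dataset_dir, years):
    """Stack the rows of every CSV file found in the given year folders.

    Assumes a hypothetical layout <dataset_dir>/<year>/<file>.csv.
    """
    rows = []
    for year in years:
        year_dir = Path(dataset_dir) / str(year)
        for csv_path in sorted(year_dir.glob("*.csv")):
            with csv_path.open(newline="", encoding="utf-8") as handle:
                for row in csv.DictReader(handle):
                    # Record which year folder the row was read from.
                    row["source_year"] = year
                    rows.append(row)
    return rows
```

A call such as `combine_year_folders("annual/example_dataset", range(2018, 2023))` would then return the rows of the listed years as one list, where `annual/example_dataset` is a placeholder path and not a real FIONA location. Only the years to which the project has access should be listed.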

In connection with the introduction of the annual folder structure, file and variable names will also be converted to lowercase. These changes must also be taken into account in code, especially in case-sensitive programs.
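One way to make existing code tolerant of the lowercase conversion is to normalise names before using them. The sketch below is only an example under stated assumptions: it treats a data row as a Python dictionary keyed by variable name, and the helper names `lowercase_keys` and `find_file_ignoring_case` are hypothetical, not part of any FIONA tooling.

```python
from pathlib import Path


def lowercase_keys(record):
    """Return a copy of a row dictionary with every variable name in lowercase."""
    return {name.lower(): value for name, value in record.items()}


def find_file_ignoring_case(folder, file_name):
    """Return the path in `folder` whose name matches `file_name`
    regardless of letter case, or None if there is no such file."""
    wanted = file_name.lower()
    for candidate in Path(folder).iterdir():
        if candidate.name.lower() == wanted:
            return candidate
    return None
```

Code that refers to variables only through their lowercase names continues to work both with older files that still use mixed case and with the new lowercase files during the transition period.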

Transition period

The new annual data location will be introduced during the first weeks of June 2026. The new annual location and the old ready-made/continuous location will then coexist for a few months. During this period, researchers are expected to start using the new location and update their code as necessary. The old ready-made/continuous location will be decommissioned in autumn 2026. The exact timing will be announced later.