Data and their delivery modes

Statistics Finland’s extensive data files support a wide range of research uses. The Research Services offer researchers access to ready-made and tailored data materials. Data from both enterprise and individual databases are available in several ways.

Unit-level data can be used via the remote access system and at the Research Laboratory. Certain data can also be released to the applicant's organisation.

Data

Ready-made research data

The Research Services have compiled ready-made research data that offer a versatile information basis for studying Finnish business activity and the characteristics and development of the population. The ready-made data contain a certain set of variables connected to the theme of the data module. The ready-made data modules and the variables they contain are presented in the Taika research data catalogue.
Research project-specific extractions are primarily made from ready-made datasets containing personal data. The licence applicant specifies the target population and the years for which the extraction is made on the application form. If there are valid reasons, the project may have a ready-made dataset at its disposal as total data.

The available datasets include:

  • EDUC education data modules 
  • FIRM enterprise data modules 
  • FOLK personal data modules
  • SES wage and salary data modules
  • TAX income data modules

Data from other authorising bodies, such as Finnish Customs, Traficom and the Matriculation Examination Board, are also available as ready-made datasets. Descriptions of unit-level data intended for research use can be found in the Taika research data catalogue. Please note that not all data descriptions are available in English. Where possible, consult the descriptions in Finnish.

The datasets can be combined consistently by means of secured unit identifiers, and they are updated regularly. In addition, tailored data from Statistics Finland's data warehouse can be combined with ready-made datasets according to researchers' needs. Ready-made datasets are used through the FIONA remote access system.
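
As a rough illustration of what combining datasets by a secured unit identifier can look like inside a researcher's own analysis code, the following minimal Python sketch links two small tables on a shared identifier column. The column name pseudo_id, the sample rows and the function link_records are all hypothetical and are not taken from any Statistics Finland dataset or tool. The sketch also assumes that the identifier is unique in the second table.

```python
def link_records(left_rows, right_rows, key="pseudo_id"):
    """Attach the matching right-hand row to each left-hand row.

    Assumes `key` is unique in right_rows. Left rows without a match
    are dropped (an inner join).
    """
    right_by_key = {row[key]: row for row in right_rows}
    linked = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            linked.append({**row, **match})
    return linked


# Hypothetical rows standing in for two data modules.
persons = [
    {"pseudo_id": "A1", "birth_year": 1980},
    {"pseudo_id": "B2", "birth_year": 1975},
]
education = [
    {"pseudo_id": "A1", "degree_level": "tertiary"},
]

linked = link_records(persons, education)
```

Because both tables carry the same protected identifier, the link can be made without access to any direct identifier.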

Read more about ready-made datasets in the Taika research data catalogue

Ready-made data subject to a one-off fee    

If the project has no need for updates, ready-made data can be obtained for a one-off fee. The fee is charged after delivery. During this period, data can be copied to the project workspace (W disk) in FIONA. The copied data will be available to the project for the duration of the licence.

Ready-made data with continuous delivery

With continuous delivery, the research project pays an annual fee and has access to the ready-made modules and their future updates for as long as the licence is valid.

A project that is using ready-made data subject to a one-off fee can apply for continuously delivered modules through the normal application process.

Projects that were granted continuous ready-made personal datasets as total data before 1 June 2024 can continue to use the datasets for the validity of the user licence, provided that no changes to the user licence are applied for. When applying for changes to the validity of the user licence or to the dataset, the reason why continuous ready-made datasets are needed as total data must be given on the application form or to the processor at the processing stage. If only a change to the validity of the user licence is applied for, the application is processed as a dataset licence. If there is no need for total data, updating of the total data is discontinued at the end of the calendar year. In this case, the project has until the end of the calendar year to copy the data it needs to the W disk in FIONA.

Extraction from ready-made data means that only the rows concerning the research target population are extracted from the ready-made data. The target population is defined in connection with the user licence application. A user licence can be granted in advance for subsequent extraction years, which allows the researcher to request a repeat of the target population extraction, i.e. an updated extraction, under the same user licence when new statistical years become available. Updated extractions are subject to a fee.
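
The idea of an extraction and an updated extraction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function extract_rows, the column names pseudo_id and year, and the sample rows are hypothetical, and the actual extraction is carried out by the Research Services, not by the researcher.

```python
def extract_rows(module_rows, target_ids, years):
    """Keep only rows that belong to the target population and the requested years."""
    wanted_ids = set(target_ids)
    wanted_years = set(years)
    return [
        row
        for row in module_rows
        if row["pseudo_id"] in wanted_ids and row["year"] in wanted_years
    ]


# Hypothetical rows standing in for a ready-made data module.
module_rows = [
    {"pseudo_id": "A1", "year": 2021, "income": 30000},
    {"pseudo_id": "B2", "year": 2021, "income": 41000},
    {"pseudo_id": "A1", "year": 2022, "income": 31500},
    {"pseudo_id": "C3", "year": 2022, "income": 28000},
    {"pseudo_id": "A1", "year": 2023, "income": 33000},
]

# The target population is fixed when the licence is applied for.
target_population = ["A1", "C3"]

# First extraction covers the years available at the time.
first_extraction = extract_rows(module_rows, target_population, [2021, 2022])

# An updated extraction repeats the same selection once a new year exists.
updated_extraction = extract_rows(module_rows, target_population, [2021, 2022, 2023])
```

The target population stays the same between the two calls. Only the set of years grows, which is what distinguishes an updated extraction from a new one.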

Tailored data

Statistics Finland’s statistical data offer extensive possibilities for compiling tailored research data according to research needs. For example, data from statistical production are available on the population, employment, household-dwelling units, education, justice and enterprises. Information about the data behind each set of statistics can be found on the web pages of the statistics concerned. See statistics by topic.

Tailored data can be used in the FIONA remote access system and at the Research Laboratory. Data on age, sex, education, occupation and socio-economic group may be released to the applicant’s organisation with identification data if the applicant is entitled to collect such data by virtue of the General Data Protection Regulation. An account of this must be included in the application. An additional requirement is that the release of the data in identifiable form is necessary with regard to the study.

Cause of death data

Cause of death data can be released in table format or as copies of death certificates. More detailed instructions on the release of cause of death data are available on the home page of the archive of death certificates. The processing of user licences and data delivery are done in the Research Services. A description of the basic data on causes of death and other available data can be found in the Taika research data catalogue.

Findata processes licence applications that concern combining cause of death data with data from more than one of the register controllers mentioned in the Act on the Secondary Use of Health and Social Data. Such applications should be directed to Findata. Read more on Findata’s pages.

Service data files or interview data related to population and living conditions

The most essential interview data describing the population and living conditions have been formed into service data files which Statistics Finland offers for research use. Further information about these service data files is available on our website from the pages dedicated to each of these surveys.

Data delivery modes

Unit-level data can be used via the remote access system and at the Research Laboratory. Service data files may also be released for research use. In addition, cause of death data and data on the following variables can be released to the applicant's own organisation: age, sex, education, occupation and socio-economic group.

Remote access use 

Statistics Finland’s FIONA remote access system is a secure processing environment for the unit-level research data that are subject to a user licence. Through the FIONA system, researchers can use Statistics Finland’s data via a secure remote connection. Data from other authorities, researchers’ own data, and data that are not protected against indirect identification can also be made available for remote access use.

In the remote access system, researchers have their own project-specific storage space for the research data, analysis results and code, in accordance with the user licence. Research results and other material may only be transferred outside the system through a screening process. The data in FIONA are processed in pseudonymised form, which means that unique identifiers are protected with pseudo-identifiers.
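
To make the notion of a pseudo-identifier more concrete, the sketch below shows one common, generic way to pseudonymise an identifier: a keyed hash (HMAC-SHA256). This is an assumption chosen for illustration only. The source does not describe how Statistics Finland generates its pseudo-identifiers, and the function name, key and sample rows here are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudo-identifier for `identifier` using a keyed hash.

    The same identifier and key always give the same result, so rows can
    still be linked, but the original value is not visible in the output.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and rows. A real key would be kept secret by the data controller.
key = b"example-key-not-for-real-use"
rows = [
    {"person_id": "ID-0001", "occupation": "nurse"},
    {"person_id": "ID-0002", "occupation": "welder"},
]

# Replace the direct identifier with its pseudo-identifier in every row.
pseudonymised_rows = [
    {"pseudo_id": pseudonymise(row["person_id"], key), "occupation": row["occupation"]}
    for row in rows
]
```

A researcher working with such rows can follow the same unit across datasets through the pseudo-identifier without ever seeing the underlying identifier.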

Use at the Research Laboratory

At Statistics Finland’s Research Laboratory, researchers can handle protected data in accordance with their user licences on a workstation reserved for them, through the FIONA remote access system. Data not protected against indirect identification can also be processed at the Research Laboratory. By renting a Research Laboratory site, researchers get access to the protected data covered by the user licence through the FIONA system, together with the necessary IT and support services. The Research Laboratory is located in the premises of the Library of Statistics in Helsinki.

The data are mainly in SAS format. In addition, analyses can be made by using the Stata, SPSS and R software, for example. Research results and other material may only be transferred outside the Research Laboratory through a screening process. The use of the Research Laboratory is billed in accordance with the price list of the Research Services.

Release of data to researchers

Cause of death data (data derived from the death certificate) and data on age, sex, education, occupation and socio-economic group can be released to an organisation. The conditions for release are that data in identifiable form are necessary for the research and that a data protection description concerning the data (an account of how the data will be processed) is submitted to Statistics Finland for viewing.

Service data files in which the possibility for indirect identification has been removed can also be released to an organisation.