Which datasets can be combined?

Statistics Finland's research datasets can be combined with both Statistics Finland's datasets and other datasets. The combining is usually done by Statistics Finland.

The possibilities for combining depend on the dataset to be combined.

  1. Datasets collected with an interview or a survey: combining is possible only if the respondents have been informed that the datasets will be combined. A user licence is required for combining.
  2. Unit-level datasets other than those of Statistics Finland: combining is possible only if a user licence has been obtained from the owner of the other dataset. Statistics Finland's user licence is also needed for combining.
  3. Public statistical data of statistical authorities: combining is possible. No user licence is required for combining.
  4. Public statistical data of bodies other than statistical authorities: combining is possible. A user licence is required for combining.
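
The four cases above can be read as a small lookup table. The sketch below is only an illustration of that reading, not an official tool: the names `DatasetKind`, `CombiningRule` and `combining_rule` are hypothetical, and it assumes that the "user licence" in cases 1 and 4 means a Statistics Finland user licence.

```python
from dataclasses import dataclass
from enum import Enum


class DatasetKind(Enum):
    """Hypothetical labels for the four cases listed above."""
    SURVEY_OR_INTERVIEW = 1
    EXTERNAL_UNIT_LEVEL = 2
    PUBLIC_STATISTICAL_AUTHORITY = 3
    PUBLIC_OTHER = 4


@dataclass(frozen=True)
class CombiningRule:
    # Assumption: "user licence" is read as Statistics Finland's user licence.
    statfin_licence_required: bool
    # Precondition stated in the list, or an empty string when none is stated.
    precondition: str


RULES = {
    DatasetKind.SURVEY_OR_INTERVIEW: CombiningRule(
        True, "respondents have been informed of the combining"
    ),
    DatasetKind.EXTERNAL_UNIT_LEVEL: CombiningRule(
        True, "user licence obtained from the owner of the other dataset"
    ),
    DatasetKind.PUBLIC_STATISTICAL_AUTHORITY: CombiningRule(False, ""),
    DatasetKind.PUBLIC_OTHER: CombiningRule(True, ""),
}


def combining_rule(kind: DatasetKind) -> CombiningRule:
    """Return the rule that applies to the dataset to be combined."""
    return RULES[kind]


if __name__ == "__main__":
    for kind in DatasetKind:
        rule = combining_rule(kind)
        print(kind.name, rule.statfin_licence_required, rule.precondition or "-")
```

Only case 3 comes out without a licence requirement. The sections below add details, such as the cause of death exception, that this table does not capture.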

1. Combining with interview or survey datasets

When datasets collected through interviews or surveys are combined with other datasets in a research dataset, the processing of personal data must be transparent to the data subject.

Informing respondents at the start of data collection

Participants in interviews and surveys must be informed that their data will be combined with register data when they are first contacted about the data collection.

Respondents must be informed about:

  • the purpose for the processing of personal data
  • the controller
  • the data groups extracted from different registers
  • the data release targets.

Combining with unit-level datasets

If Statistics Finland's individual-level data are to be combined with datasets collected with an interview or a survey, the participant must be informed of this.

The respondents must be told that data are also retrieved from Statistics Finland. In addition, they must be informed of the personal data groups from which data are to be combined. Examples of personal data groups are income, education and labour force status.

The information given to respondents must also specify the time period for which the data are to be combined. The data may concern only the time of the survey or a longer period, for example work and educational history.

Certainty with an early user licence

We recommend that you apply for a user licence for combining survey data with register data before conducting the survey. Questions related to informing respondents are then handled during licence processing, before any data are collected.

2. Combining with unit-level datasets of other authorities

If unit-level datasets are to be combined with unit-level datasets owned by a party other than Statistics Finland, a user licence is needed from both that party and Statistics Finland.

Cause of death datasets are an exception. If they are combined with other unit-level datasets and the produced dataset is used in the FIONA remote access environment, a user licence is needed both from Findata and Statistics Finland. However, if cause of death datasets are released or used in Findata's remote access system Kapseli, then Findata’s user licence is sufficient.
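
The cause of death exception depends on where the data are used. A minimal sketch of that rule follows. The function name `licences_for_cause_of_death` and the string labels are hypothetical, and the sketch covers only the two environments named above.

```python
from __future__ import annotations


def licences_for_cause_of_death(environment: str) -> set[str]:
    """Return the licence issuers needed for cause of death data.

    Only the two environments named in the text are handled; anything
    else is rejected rather than guessed.
    """
    env = environment.strip().lower()
    if env == "fiona":
        # Combined with other unit-level data and used in FIONA.
        return {"Findata", "Statistics Finland"}
    if env == "kapseli":
        # Released or used in Findata's Kapseli system.
        return {"Findata"}
    raise ValueError(f"Environment not covered by this sketch: {environment!r}")


if __name__ == "__main__":
    print(sorted(licences_for_cause_of_death("FIONA")))
    print(sorted(licences_for_cause_of_death("Kapseli")))
```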

3. Combining with public statistical data of statistical authorities

The datasets can be combined with public statistical data of statistical authorities without a user licence. In addition to Statistics Finland, the statistical authorities are Finnish Customs, the Finnish Institute for Health and Welfare (THL) and the Natural Resources Institute Finland (Luke).

4. Combining with public statistical data of bodies other than statistical authorities

Datasets can be combined with public statistical data of bodies other than statistical authorities. The combining requires a user licence from Statistics Finland.

Instructions for delivering the dataset to be combined

The datasets to be combined are usually delivered through a data transfer service. The datasets must be named according to the instructions. Read the instructions carefully, because misnamed datasets will not be transferred correctly.