4 Digital Twin datahub

4.1 Purpose

The Digital Twin database contains data sets with indicators and parameters of the city. These allow users to solve monitoring problems and perform analysis and evaluation to facilitate forecasting and planning for the development of territories.

Analytically verified data sets on social, economic, and spatial development are designed to improve the reliability of analytical materials and evaluate the effectiveness of draft government resolutions.

4.2 Database description

The “City Digital Twin datahub” consists of regularly updated, analytically verified, machine-readable data sets comprising 174 million values of indicators and parameters of social and economic development.

The database includes the following sections:

  • reference data
  • Social & Economic Development Indicators (SED) of territories
  • matrices of socio-economic ties (structural data)
  • database of investment projects, with technical and economic indicators and socio-economic effects of investment projects (management decisions), portfolios, and programs

The indicators are broken down along the following analytical dimensions:

  • territories: a country, federal district, region, agglomeration, municipality, object address
  • time periods (years, including project start and end dates)
  • indicators (demographic, economic, social, etc.; basic, relative, and generalized; cumulative, instantaneous, and growth rates)
  • for economic indicators: industries and budget items
  • for demographic indicators: gender and age structure
  • scenarios (actual, reconstructed, estimated, target, and median)
  • versions: indicating the date of loading and estimation
  • data sources and recipients
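One way to picture these dimensions is as the fields of a single observation record. The sketch below is illustrative only: the class name `IndicatorValue`, every field name, the version date, and the source label are assumptions and do not describe the datahub's actual schema. The territory code, indicator code, year, and value are taken from the regional sample in section 4.3.3.1.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class IndicatorValue:
    """One value positioned along the analytical dimensions listed above.

    All field names are hypothetical; they only mirror the dimension list.
    """
    territory_code: str                   # country, district, region, municipality, ...
    year: int                             # time period
    indicator_code: str                   # e.g. "C706"
    scenario: str                         # actual, reconstructed, estimated, target, median
    version: date                         # date of loading or estimation
    source: str                           # data source
    value: float
    industry_code: Optional[str] = None   # economic indicators only
    sex: Optional[str] = None             # demographic indicators only
    age_group: Optional[str] = None       # demographic indicators only

    def key(self):
        """Tuple identifying the cell this value occupies in the data cube."""
        return (self.territory_code, self.year, self.indicator_code,
                self.industry_code, self.sex, self.age_group,
                self.scenario, self.version)


record = IndicatorValue(
    territory_code="58000000", year=2000, indicator_code="C706",
    scenario="actual", version=date(2022, 1, 1),  # version date is made up
    source="official statistics", value=16178.9,
)
print(record.key())
```

Keeping scenario and version inside the key means that an actual value and a forecast for the same territory, year, and indicator can coexist without overwriting each other.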

The actual and reconstructed indicators of social and economic development serve as the basis for calculating (calibrating) the following matrices with parameters:

  • multipliers (intersectoral balances)
  • tensions (inter-territorial balances)
  • correlation (sensitivity)
  • conversion of indicators into their main components, providing connectivity and consistency of indicators

The prepared matrices are used to build forecasts, conduct What-if analysis and answer “What is needed” questions, assess impact, and form agreed parameters for development plans.
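As an illustration of how a multiplier matrix can answer a What-if question, the sketch below assumes the standard Leontief input-output formulation, in which a change in final demand Δd leads to a change in total output Δx that satisfies (I - A) Δx = Δd. The source does not state that the datahub uses exactly this formulation, and the two-sector coefficient matrix and the function names are invented for the example.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def output_change(coefficients, demand_change):
    """Total output change per sector for a given change in final demand."""
    n = len(demand_change)
    leontief = [[(1.0 if i == j else 0.0) - coefficients[i][j] for j in range(n)]
                for i in range(n)]
    return solve_linear(leontief, demand_change)


# Hypothetical technical coefficients: A[i][j] is the input from sector i
# needed to produce one unit of output in sector j.
A = [[0.2, 0.3],
     [0.1, 0.4]]

# What if final demand for sector 0 grows by 100 units?
delta = output_change(A, [100.0, 0.0])
print(delta)
```

The total change exceeds the initial 100 units because each sector's extra output requires inputs from the other sector; that difference is the multiplier effect.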

The datasets are collected from open sources. They contain actual SED indicator values since 2000 (for regions), since 2009 (for municipalities), and since 2012 (for sectoral indicators), together with forecast indicator scenarios up to 2050:

  • for the country as a whole
  • for 2,300 municipalities
  • for 1,200 cities
  • for 22,000 settlements
  • for 128 industries

4.3 Datasets

4.3.1 Data model

City Digital Twin datasets are divided into 3 main interconnected domains:

  • master data (the green blocks on the picture)
  • fact-oriented datasets (divided by demographics, and economic and social facts)
  • document-oriented datasets

Fact- and document-oriented datasets are based on dimensions placed in master datasets.

Data model of City Digital Twin database

4.3.2 Master data

Master data tables are stored in a PostgreSQL relational database. Access to them is provided through functions in the R package DTwinDW.

Master data tables hold the key attributes and include the following catalogs:

  1. Catalog of Indicators (indicator)
  2. Catalog of Territories (location)
  3. Catalog of Industries (okved)
  4. Catalog of Ages
  5. Catalog of Genders (sex)
  6. Catalog of Scenarios (scenario)
  7. Catalog of Time Periods (time)
  8. Catalog of Units of Measurement (unit)

The Catalog of Indicators contains information on basic, relative and generalized indicators of social and economic development and natural-anthropogenic development.

Quantitative metrics of master data:

| # | Master-data table | Records | Attributes |
| --- | --- | --- | --- |
| 1 | kpi | 1088 | id, long_name, description, short_name, fact_type, code, unit_id, indicator_type_id, boo_code, en_long_name, hide, short_name2, esg, sd_model, rg_model, id_rosstat |
| 2 | location | 3039 | id, long_name, description, oktmo, okato, long_name_eng, stage, type, agglomeration, hide, region, actual, join_city, latitude, longitude, 100city, tz2022, parent, iso_country, sys_nuts, osm_level, complex_id, synonyms, currency |
| 3 | industry | 121 | id, okved, industry_geoveb, lvl1, lvl2, mob_61, mob_210, short_name, description, long_name, description_eng, industry_name_geoveb, gui_group, target2021, mob2021, short_name2 |
| 4 | age | 127 | id, long_name, description, unique_name |
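To show how fact records can rest on the master data dimensions, the sketch below joins a fact row to the `kpi` and `location` catalogs. It is a minimal illustration under several assumptions: it uses an in-memory SQLite database from the Python standard library as a stand-in for the production database, the `fact_value` table and its columns are hypothetical, and only a few of the catalog attributes listed above are reproduced. The sample values come from the regional example in section 4.3.3.1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Master data: reduced to a few of the attributes listed above.
    CREATE TABLE kpi (id INTEGER PRIMARY KEY, code TEXT, long_name TEXT);
    CREATE TABLE location (id INTEGER PRIMARY KEY, long_name TEXT);

    -- Hypothetical fact table: one row per indicator, territory, and year.
    CREATE TABLE fact_value (
        kpi_id      INTEGER REFERENCES kpi(id),
        location_id INTEGER REFERENCES location(id),
        year        INTEGER,
        value       REAL
    );

    INSERT INTO kpi VALUES (1, 'C706', 'Domestic product');
    INSERT INTO location VALUES (1, 'Father');
    INSERT INTO fact_value VALUES (1, 1, 2000, 16178.9);
""")

# Resolve the surrogate keys in the fact row to readable catalog names.
rows = conn.execute("""
    SELECT l.long_name, k.long_name, f.year, f.value
    FROM fact_value AS f
    JOIN kpi      AS k ON k.id = f.kpi_id
    JOIN location AS l ON l.id = f.location_id
    WHERE k.code = 'C706'
""").fetchall()

for row in rows:
    print(row)
```

Storing only keys in the fact table keeps names, translations, and classification attributes in one place, so a catalog correction does not require rewriting millions of fact rows.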

4.3.3 Basic SED indicators

4.3.3.1 Regional indicators

Regional indicators of social, economic, and industrial development are based on data from official statistical bodies and publicly available regional data, and include:

  • Actual values (74,000 macroeconomic values, 1.3 million industry values) for the following groups of indicators:
    • demographic indicators (8 items)
    • economic indicators (63 items)
    • social and other indicators (19 items)

| Region | Indicator | Year | Value | Region.code | Indicator.code |
| --- | --- | --- | --- | --- | --- |
| Father | Domestic product | 2000 | 16178.9 | 58000000 | C706 |
| Asmakhan | Population over working age | 2018 | 277117.96 | 69000000 | C042 |
| Trupits | Tax revenues of budgets of all levels | 2017 | 113369.62 | 47000000 | C708 |
| Opossit | Total area of residential premises | 2009 | 29892293.2 | 94000000 | C149 |
| Dassault | City budget expenditures, including: Road facilities (road funds) | 2020 | 10320 | 56000000 | C074 |
| Primavera | City budget expenditures, including: Housing and communal services | 2017 | 20529 | 04000000 | C075 |
| Stronger | City budget revenues, including: State duty | 2018 | 256 | 24000000 | C058 |

  • Estimated indicators (21.4 million items) of social and economic development and strategic goals (143 groups of indicators)

| Indicator | Year | Value | Region.code | Indicator.code |
| --- | --- | --- | --- | --- |
| Tax revenues of budgets of all levels | 2009 | 236.09 | 61000000 | C708 |
| Tax revenues of budgets of all levels | 2023 | 0.88 | 53000000 | C708 |
| Final consumption | 2001 | 631.32 | 98000000 | C952 |
| Gross value added | 2007 | 1440.71 | 17000000 | C784 |
| Р24103 Current income tax (For the reporting period) | 2029 | 0.62 | 18000000 | C499 |

  • Analytics for the period 2000-2021 (actual data) and for 2020-2050 (inertial and investment forecasts)
  • Gender and age structure and migration flow estimations (including commuting) for demographic indicators
  • Industry analytical data segments for 85 industries (of level 1) and 128 industries (of level 2)

| region | industry | indicator_name | year | value | location | industry.code | Indicator.code |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Region 1 | Information technologies | Income | 2023 | 2760.29 | 46000000 | 63 | C493 |
| Region 2 | Air & space industry | All level budget tax revenues | 2023 | 5268.75 | 65000000 | 51 | C708 |
| Region 3 | Machinery manufacturing | Gross profit (loss) | 2028 | 426.6 | 84000000 | 28 | C479 |
| Region 4 | Creative industries | CoGS | 2024 | 915.25 | 67000000 | 90 | C1048 |
| Region 5 | Delivery services | Cost of sales | 2027 | 9.19 | 81000000 | 53 | C477 |
| Region 6 | Paper manufacturing | Fixed capital | 2024 | 0.17 | 08000000 | 17.2 | C481 |

  • Municipal analytical data segments of 2,300 municipalities
  • Scenario analytical data segments - inertial scenario and investment scenarios (taking into account the impact of investment projects and programs available) of the social and economic development of territories

| scenario | indicator_name | region | year | value | location | indicator |
| --- | --- | --- | --- | --- | --- | --- |
| invest | Growth of household earnings | Region 1 | 2023 | 0.04 | 77000000 | C003 |

4.3.3.2 Municipal indicators

Municipal indicators (291 items) aggregated to the regional level from the municipal statistics database and the balance sheets of the Federal Tax Service, including:

  • Actual values (16 million macroeconomic and 24.4 million sectoral indicators) for the following groups of indicators:
    • demographic indicators (33 items)
    • economic indicators (142 items)
    • social and other indicators (69 items)

Municipal indicators

  • Estimated indicators (21.4 million items) of social and economic development and strategic goals (143 items)
  • Analytics for the period 2009-2021 (actual data) and for 2020-2050 (inertial and investment forecasts)

Municipal indicator projection

  • Gender and age structure and estimated migration flows (including commuting) for demographic indicators

Structural values

  • Industry analytical data segments of 85 industries (of level 1) and 128 industries (of level 2)
  • Municipal analytical data segments - 24,221 settlements
  • Scenario analytical data segments - inertial scenario and investment scenario (taking into account the impact of investment projects and programs available)

4.3.4 Matrices of socio-economic ties

Matrices of socio-economic ties are intended for analysis, estimation, forecasting, and scenario-based planning.

Matrices allow you to quickly estimate a variety of scenarios and evaluate the intersectoral and interterritorial impact of management decisions (investment projects and regulatory decisions) on socio-economic development.

  • Regional and municipal intersectoral balance matrices based on 128 industries of the 2nd level for the actual data for 2016-2021 and for the forecast period of 2022-2040 for calculating intersectoral and interterritorial effects, including tables:

    • Resources in terms of goods and services
    • Use of goods and services at buyers’ prices
    • Use of goods and services at basic prices
    • Use of domestic products at basic prices
    • Use of imported products
    • Trade and transport margins
    • Taxes (net of subsidies) on products
    • Matrix of multipliers (production and technological coefficients)
  • Interterritorial balances, including estimates of the directions of passenger traffic, cargo traffic, exports, and imports.

  • Correlation matrices (sensitivity matrices) for scenario calculations for the impact of management decisions on the dynamics of socio-economic development indicators. These include an assessment of the impact of basic indicators (30 indicators on 30 indicators) characterizing socio-economic development for 2020-2050.

  • Influence matrices (mutual influence through principal components and/or eigenvectors) of indicators (291 indicators x 7 indicators x 291 indicators), allowing observers to evaluate the impact of a scenario change in any indicator (taking into account the analytical data segment) from the database on the change of all indicators in the medium term (3-5 years) and long term (5-30 years).
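The way a sensitivity matrix feeds a scenario calculation can be sketched as a first-order propagation: the response of indicator i is the sum over j of S[i][j] times the change in indicator j. This linear form is an assumption made for illustration, not the documented method, and the three indicator names and all coefficients below are invented.

```python
def propagate(sensitivity, shock):
    """First-order effect: response[i] = sum_j sensitivity[i][j] * shock[j]."""
    return [sum(s_ij * d_j for s_ij, d_j in zip(row, shock))
            for row in sensitivity]


# Hypothetical indicators and coefficients (the datahub works with 30 x 30).
indicators = ["investment", "employment", "tax_revenue"]
sensitivity = [
    [1.0, 0.0, 0.0],   # investment responds only to itself
    [0.3, 1.0, 0.0],   # employment reacts to investment
    [0.2, 0.5, 1.0],   # tax revenue reacts to investment and employment
]

# Scenario: a management decision raises investment by 10 percent.
shock = [0.10, 0.0, 0.0]
response = dict(zip(indicators, propagate(sensitivity, shock)))
print(response)
```

A real calculation would also have to handle time lags and the analytical data segment (territory, industry, scenario), which this sketch leaves out.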

4.3.5 Management Decisions

With the help of Digital Twin, it is possible to assess how the following management decisions affect the inertial development of territories:

  • changes in monetary policy parameters, including the consumer price index, refinancing rate, and tax burden
  • parameters of strategies and forecasts for socio-economic and sectoral development
  • plans and programs for the development of territories and industries
  • industry schemes
  • investment projects, portfolios, and programs

To date, a portfolio of investment projects has been collected and evaluated. It includes data on investment projects (46 thousand projects) carried out from 2002 to 2025, mainly using federal budget funds.

Data sources are:

  • the federal investment program
  • electronic budgets
  • investment decisions
  • investment projects of financial institutions

The initial data on the impact of investment decisions on a project may contain the following characteristics:

  • volumes and sources of financing
  • location of financing objects
  • the type of capital construction object
  • its address
  • the start and completion dates of the project

Estimated indicators (146 items) of investment projects include an assessment of the indicators of the financial flow of the project (based on the median values of the target region) and an assessment of their impact on the dynamics of socio-economic development indicators.
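A median-based estimate of this kind could look like the sketch below. It is only one possible reading of "based on the median values of the target region": the function names, the revenue-to-capital ratio used as the estimator, and all observations are hypothetical.

```python
from statistics import median


def regional_median_ratio(observations, region):
    """Median revenue-to-capital ratio among observations in one region."""
    ratios = [o["revenue"] / o["fixed_capital"]
              for o in observations
              if o["region"] == region and o["fixed_capital"] > 0]
    if not ratios:
        raise ValueError(f"no usable observations for region {region!r}")
    return median(ratios)


def estimate_project_revenue(investment, observations, region):
    """Scale the planned investment by the region's median ratio."""
    return investment * regional_median_ratio(observations, region)


# Invented observations for illustration only.
observations = [
    {"region": "Region 1", "revenue": 200.0, "fixed_capital": 100.0},
    {"region": "Region 1", "revenue": 300.0, "fixed_capital": 100.0},
    {"region": "Region 1", "revenue": 400.0, "fixed_capital": 100.0},
    {"region": "Region 2", "revenue": 900.0, "fixed_capital": 100.0},
]

estimate = estimate_project_revenue(100.0, observations, "Region 1")
print(estimate)
```

Using the median rather than the mean keeps a single atypical enterprise in the region from dominating the estimate.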

4.4 Dataset Subscription Service

The Dataset Subscription Service is now available on Dtwin.city.

The subscription service includes the following regular (quarterly) tasks:

  • data preparation
    • obtaining programmatic access to external data sources with indicators of socio-economic and spatial development
    • profiling primary data sources
    • downloading and recognizing raw data
    • confirming the completeness of primary data from sources
    • maintaining directories (indicators, analytical dimensions, scenarios, versions, stages of processing), data models of sources and recipients, and correspondence tables, taking into account changes over time
    • documenting changes in the composition and structure of data sources and recipients
    • documenting changes in accounting methods
  • data processing
    • eliminating technical errors in primary data formats, including field naming, value formats, shifts in data series, and gaps in analytical data segmentation
    • aggregating downloaded primary data and reducing it to a reference structure
    • detecting changes in primary data, including retrospective changes, in the data structure, directories, and actual values
    • recovering gaps and eliminating duplicate data
    • validating indicator values and proposing adjustments by econometric methods according to a set of rules, including:
      • meeting balance ratios
      • being within acceptable intervals
      • satisfying the ratios of the principal components (eigenvectors)
  • providing data
    • creating and supporting a programming interface (REST API) to directories and to sets of primary and analytically verified data
    • uploading value sets and reference data in csv, xlsx, parquet, qs, and fst formats
    • documenting collection, processing, and provision processes
    • labelling indicator values with scenarios (fact assessment, forecast, plan, scenario, and goal), stages (initial, amended, corrected, and stage number), versions, and accounting methods, which is necessary for the correct interpretation and tracing of indicator values
    • maintaining a library of verification methods and rules for verifying and validating indicator values
    • preparing an interactive report on the scope, completeness, and identified errors
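Two of the validation rules listed above, acceptable intervals and balance ratios, can be sketched as simple checks. This is a minimal illustration under assumptions: the record layout, the issue labels, and the 1 percent tolerance are hypothetical, and the principal-component rule is not covered.

```python
def check_interval(value, low, high):
    """True when the value lies inside the acceptable interval."""
    return low <= value <= high


def check_balance(total, parts, rel_tol=0.01):
    """True when the parts sum to the total within a relative tolerance."""
    return abs(sum(parts) - total) <= rel_tol * abs(total)


def validate(record):
    """Return the list of rule violations found for one indicator value."""
    issues = []
    if not check_interval(record["value"], record["low"], record["high"]):
        issues.append("out_of_range")
    parts = record.get("parts")
    if parts is not None and not check_balance(record["value"], parts):
        issues.append("balance_mismatch")
    return issues


# Hypothetical records: a population total with male and female components.
consistent = {"value": 1000.0, "low": 0.0, "high": 5000.0, "parts": [480.0, 520.0]}
unbalanced = {"value": 1000.0, "low": 0.0, "high": 5000.0, "parts": [480.0, 500.0]}
negative = {"value": -5.0, "low": 0.0, "high": 5000.0}

for rec in (consistent, unbalanced, negative):
    print(validate(rec))
```

Returning a list of issues instead of raising on the first failure suits the interactive report mentioned above, where every identified error needs to be shown.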

Questions & proposals
All rights reserved Digital twin LLC