4 Digital Twin datahub
4.1 Purpose
The Digital Twin database contains data sets with indicators and parameters of the city. These allow users to solve monitoring problems and perform analysis and evaluation to facilitate forecasting and planning for the development of territories.
Analytically verified data sets on social, economic, and spatial development are designed to improve the reliability of analytical materials and evaluate the effectiveness of draft government resolutions.
4.2 Database description
Regularly updated analytically verified sets of machine-readable data for “City Digital Twin datahub” include 174 million values of indicators and parameters of social and economic development.
The database includes the following sections:
- reference data
- Social & Economic Development Indicators (SED) of territories
- matrices of socio-economic ties (structural data)
- database of investment projects, with technical and economic indicators and socio-economic effects of investment projects (management decisions), portfolios, and programs
The indicators are given in the following analytical dimensions:
- territories: a country, federal district, region, agglomeration, municipality, object address
- time periods (years, including project start and end dates)
- indicators (demographic, economic, social, etc.; basic, relative, and generalized; cumulative, instantaneous, and growth rates)
- for economic indicators: industries and budget items
- for demographic indicators: gender and age structure
- scenarios (actual, reconstructed, estimated, target, and median)
- versions: indicating the date of loading and estimation
- data sources and recipients
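As a rough illustration of how a single value could be addressed by the dimensions listed above, here is a minimal Python sketch. The class name `Observation`, its field names, and the version date are assumptions made for this example and are not taken from the datahub schema. The region and indicator codes are borrowed from the sample tables later in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One indicator value addressed by its analytical dimensions (hypothetical layout)."""
    location: str                    # territory code
    year: int                        # time period
    indicator: str                   # indicator code
    scenario: str                    # e.g. "actual", "estimated", "target"
    version: str                     # date of loading/estimation (ISO format assumed)
    value: float
    industry: Optional[str] = None   # only for economic indicators
    sex: Optional[str] = None        # only for demographic indicators
    age: Optional[str] = None        # only for demographic indicators
    source: Optional[str] = None     # data source

    def key(self):
        """Tuple of dimension values that identifies this observation."""
        return (self.location, self.year, self.indicator, self.scenario,
                self.version, self.industry, self.sex, self.age)


# The version date below is a made-up placeholder.
obs = Observation(location="58000000", year=2000, indicator="C706",
                  scenario="actual", version="2024-01-01", value=16178.9)
print(obs.key())
```

Keeping the optional dimensions (industry, sex, age) as nullable fields is only one possible design; separate record types per indicator family would work equally well.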
The actual and reconstructed indicators of social and economic development serve as the basis for calculating (calibrating) the following matrices with parameters:
- multipliers (intersectoral balances)
- tensions (inter-territorial balances)
- correlation (sensitivity)
- conversion of indicators into their main components, providing connectivity and consistency of indicators
The prepared matrices are used to build forecasts, conduct What-if analysis and answer “What is needed” questions, assess impact, and form agreed parameters for development plans.
The datasets are collected from open sources and contain sets of actual SED indicator values since 2000 (for regions), since 2009 (for municipalities), and since 2012 (for sectoral indicators), as well as forecast indicator scenarios up until 2050:
- for the country as a whole
- for 2,300 municipalities
- for 1,200 cities
- for 22,000 settlements
- for 128 industries
4.3 Datasets
4.3.1 Data model
City Digital Twin datasets are divided into 3 main interconnected domains:
- master data (the green blocks on the picture)
- fact-oriented datasets (divided into demographic, economic, and social facts)
- document-oriented datasets
Fact- and document-oriented datasets are based on dimensions placed in master datasets.
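The relationship between fact rows and master data can be sketched as a simple lookup by code. This is an illustrative Python example only: the dictionaries, the `resolve` helper, and the region names are hypothetical, while the codes and indicator names are taken from the sample tables in this section.

```python
# Hypothetical excerpts of two master-data catalogs, keyed by code.
locations = {"58000000": "Region A", "69000000": "Region B"}
indicators = {"C706": "Domestic product",
              "C042": "Population over working age"}

# Fact rows carry only codes for their dimensions.
facts = [
    {"location": "58000000", "indicator": "C706", "year": 2000, "value": 16178.9},
    {"location": "69000000", "indicator": "C042", "year": 2018, "value": 277117.96},
]


def resolve(fact, locations, indicators):
    """Return a copy of the fact row with readable names attached.

    Raises KeyError if a code is missing from the master data.
    """
    return {
        **fact,
        "location_name": locations[fact["location"]],
        "indicator_name": indicators[fact["indicator"]],
    }


resolved = [resolve(f, locations, indicators) for f in facts]
for row in resolved:
    print(row["location_name"], row["indicator_name"], row["year"], row["value"])
```

Failing loudly on an unknown code is a deliberate choice in this sketch, since a fact row that cannot be tied to master data cannot be interpreted.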
4.3.2 Master data
Master data tables are stored in the PostgreDB relational DBMS. Access to master data tables is implemented through functions in the R package DTwinDW.
Master data tables contain the attribute set of keys and include the following catalogs:
- Catalog of Indicators (indicator)
- Catalog of Territories (location)
- Catalog of Industries (okved)
- Catalog of Ages
- Catalog of Genders (sex)
- Catalog of Scenarios (scenario)
- Catalog of Time Periods (time)
- Catalog of Units of Measurement (unit)
The Catalog of Indicators contains information on basic, relative and generalized indicators of social and economic development and natural-anthropogenic development.
Quantitative metrics of master data:
No. | Master-data table | Records | Attributes |
---|---|---|---|
1 | kpi | 1088 | id, long_name, description, short_name, fact_type, code, unit_id, indicator_type_id, boo_code, en_long_name, hide, short_name2, esg, sd_model, rg_model, id_rosstat |
2 | location | 3039 | id, long_name, description, oktmo, okato, long_name_eng, stage, type, agglomeration, hide, region, actual, join_city, latitude, longitude, 100city, tz2022, parent, iso_country, sys_nuts, osm_level, complex_id, synonyms, currency |
3 | industry | 121 | id, okved, industry_geoveb, lvl1, lvl2, mob_61, mob_210, short_name, description, long_name, description_eng, industry_name_geoveb, gui_group, target2021, mob2021, short_name2 |
4 | age | 127 | id, long_name, description, unique_name |
4.3.3 Basic SED indicators
4.3.3.1 Regional indicators
Regional indicators of social, economic, and industrial development are based on data from official statistical bodies and publicly available regional data, and include:
- Actual values (74,000 macroeconomic values, 1.3 million industry values) for the following groups of indicators:
  - demographic indicators (8 items)
  - economic indicators (63 items)
  - social and other indicators (19 items)
Region | Indicator | Year | Value | Region.code | Indicator.code |
---|---|---|---|---|---|
Father | Domestic product | 2000 | 16178.9 | 58000000 | C706 |
Asmakhan | Population over working age | 2018 | 277117.96 | 69000000 | C042 |
Trupits | Tax revenues of budgets of all levels | 2017 | 113369.62 | 47000000 | C708 |
Opossit | Total area of residential premises | 2009 | 29892293.2 | 94000000 | C149 |
Dassault | City budget expenditures, including: Road facilities (road funds) | 2020 | 10320 | 56000000 | C074 |
Primavera | City budget expenditures, including: Housing and communal services | 2017 | 20529 | 04000000 | C075 |
Stronger | City budget revenues, including: State duty | 2018 | 256 | 24000000 | C058 |
- Estimated indicators (21.4 million items) of social and economic development and strategic goals (143 groups of indicators)
Indicator | Year | Value | Region.code | Indicator.code |
---|---|---|---|---|
Tax revenues of budgets of all levels | 2009 | 236.09 | 61000000 | C708 |
Tax revenues of budgets of all levels | 2023 | 0.88 | 53000000 | C708 |
Final consumption | 2001 | 631.32 | 98000000 | C952 |
Gross value added | 2007 | 1440.71 | 17000000 | C784 |
Р24103 Current income tax (For the reporting period) | 2029 | 0.62 | 18000000 | C499 |
- Analytics for the period 2000-2021 (actual data) and for 2020-2050 (inertial and investment forecasts)
- Gender and age structure and migration flow estimations (including commuting) for demographic indicators
- Industry analytical data segments for 85 industries (of level 1) and 128 industries (of level 2)
region | industry | indicator_name | year | value | location | industry.code | Indicator.code |
---|---|---|---|---|---|---|---|
Region 1 | Information technologies | Income | 2023 | 2760.29 | 46000000 | 63 | C493 |
Region 2 | Air & space industry | All-level budget tax revenues | 2023 | 5268.75 | 65000000 | 51 | C708 |
Region 3 | Machinery manufacturing | Gross profit (loss) | 2028 | 426.6 | 84000000 | 28 | C479 |
Region 4 | Creative industries | CoGS | 2024 | 915.25 | 67000000 | 90 | C1048 |
Region 5 | Delivery services | Cost of sales | 2027 | 9.19 | 81000000 | 53 | C477 |
Region 6 | Paper manufacturing | Fixed capital | 2024 | 0.17 | 08000000 | 17.2 | C481 |
- Municipal analytical data segments of 2,300 municipalities
- Scenario analytical data segments - inertial scenario and investment scenarios (taking into account the impact of investment projects and programs available) of the social and economic development of territories
scenario | indicator_name | region | year | value | location | indicator |
---|---|---|---|---|---|---|
invest | Growth of household earnings | Region 1 | 2023 | 0.04 | 77000000 | C003 |
4.3.3.2 Municipal indicators
Municipal indicators (291 items) aggregated to the regional level from the municipal statistics database and the balance sheets of the Federal Tax Service, including:
- Actual values (16 million macroeconomic and 24.4 million sectoral indicators) for the following groups of indicators:
  - demographic indicators (33 items)
  - economic indicators (142 items)
  - social and other indicators (69 items)
- Estimated indicators (21.4 million items) of social and economic development and strategic goals (143 items)
- Analytics for the period 2009-2021 (actual data) and for 2020-2050 (inertial and investment forecasts)
- Gender and age structure and estimated migration flows (including commuting) for demographic indicators
- Industry analytical data segments of 85 industries (of level 1) and 128 industries (of level 2)
- Municipal analytical data segments - 24,221 settlements
- Scenario analytical data segments - inertial scenario and investment scenario (taking into account the impact of investment projects and programs available)
4.3.4 Matrices of socio-economic ties
Matrices of socio-economic ties are intended for producing analyses, estimates, forecasts, and scenario-based plans.
Matrices allow you to quickly estimate a variety of scenarios and evaluate the intersectoral and interterritorial impact of management decisions (investment projects and regulatory decisions) on socio-economic development.
Regional and municipal intersectoral balance matrices are built on 128 industries of the 2nd level and cover actual data for 2016-2021 and the forecast period of 2022-2040. They are used to calculate intersectoral and interterritorial effects and include the following tables:
- Resources in terms of goods and services
- Use of goods and services at buyers’ prices
- Use of goods and services at basic prices
- Use of domestic products at basic prices
- Use of imported products
- Trade and transport margins
- Taxes (net of subsidies) on products
- Matrix of multipliers (production and technological coefficients)
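To show how a matrix of production and technological coefficients can be used for a what-if estimate, here is a minimal Python sketch of the standard Leontief input-output calculation. It is an assumption that the datahub's multiplier matrices are applied this way; the two-industry matrix `A`, the demand figures, and the function name `leontief_output` are invented for illustration.

```python
def leontief_output(A, demand, iterations=200):
    """Approximate x = (I - A)^-1 d with the series x = d + A d + A^2 d + ...

    A[i][j] is the input from industry i needed per unit of output of industry j.
    The series converges when the spectral radius of A is below 1.
    """
    n = len(demand)
    x = list(demand)
    for _ in range(iterations):
        # One fixed-point step: x <- d + A x
        x = [demand[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Hypothetical technical coefficients for two industries.
A = [[0.2, 0.3],
     [0.1, 0.4]]

baseline = leontief_output(A, [100.0, 50.0])
# What-if: final demand for industry 0 rises by 10 units.
scenario = leontief_output(A, [110.0, 50.0])
effect = [s - b for s, b in zip(scenario, baseline)]
print(effect)
```

In this toy case, the extra output required from both industries exceeds the 10-unit demand increase, which is the intersectoral (multiplier) effect. The fixed-point iteration is used here only to keep the example free of external libraries; a linear solver would be the usual choice at 128 industries.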
Interterritorial balances, including estimates of the directions of passenger traffic, cargo traffic, exports, and imports.
Correlation matrices (sensitivity matrices) support scenario calculations of the impact of management decisions on the dynamics of socio-economic development indicators. They include an assessment of the mutual impact of the basic indicators (30 x 30 indicators) characterizing socio-economic development for 2020-2050.
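The source does not state how the correlation matrices are estimated. As one plausible illustration, the Python sketch below builds a Pearson correlation matrix from indicator time series. The indicator names and values are hypothetical, and the helper names `pearson` and `correlation_matrix` are introduced only for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def correlation_matrix(series):
    """Nested dict: matrix[a][b] is the correlation of indicators a and b."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical yearly values for three indicators.
series = {
    "population": [100.0, 102.0, 104.0, 106.0],
    "tax_revenue": [50.0, 51.0, 52.0, 53.0],
    "unemployment": [8.0, 7.0, 6.0, 5.0],
}

matrix = correlation_matrix(series)
for name, row in matrix.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

With 30 basic indicators, the same function would yield the 30 x 30 matrix mentioned above.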
Influence matrices (mutual influence through principal components and/or eigenvectors) of indicators (291 indicators x 7 indicators x 291 indicators), allowing observers to evaluate the impact of a scenario change in any indicator (taking into account the analytical data segment) from the database on the change of all indicators in the medium term (3-5 years) and long term (5-30 years).
4.3.5 Management Decisions
With the help of Digital Twin, it is possible to assess the impact of the following management decisions on the inertial development of territories:
- changes in monetary policy parameters, including the consumer price index, refinancing rate, and tax burden
- parameters of strategies and forecasts for socio-economic and sectoral development
- plans and programs for the development of territories and industries
- industry schemes
- investment projects, portfolios, and programs
To date, a portfolio of investment projects has been collected and evaluated. It includes data on investment projects (46 thousand projects) carried out from 2002 to 2025, mainly using federal budget funds.
Data sources are:
- the federal investment program
- electronic budgets
- investment decisions
- investment projects of financial institutions
The initial data on the impact of investment decisions on a project may contain the following characteristics:
- volumes and sources of financing
- location of financing objects
- the type of capital construction object
- its address
- the start and completion dates of the project
Estimated indicators (146 items) of investment projects include an assessment of the indicators of the financial flow of the project (based on the median values of the target region) and an assessment of their impact on the dynamics of socio-economic development indicators.
4.4 Dataset Subscription Service
The Dataset Subscription Service is now available on Dtwin.city.
The subscription service includes the following regular (quarterly) work:
- data preparation
  - obtaining programmatic access to external data sources with indicators of socio-economic and spatial development
  - profiling primary data sources
  - downloading and recognizing raw data
  - confirming the completeness of primary data from sources
  - maintaining directories (indicators, analytical dimensions, scenarios, versions, stages of processing), data models of sources and recipients, and correspondence tables, taking into account changes over time
  - documenting changes in the composition and structure of data sources and recipients
  - documenting changes in accounting methods
- data processing
  - eliminating technical errors in primary data formats, including field naming, value formats, shifts in data series, and gaps in analytical data segmentation
  - aggregating downloaded primary data and reducing it to a reference structure
  - detecting changes in primary data, including retrospective changes to the data structure, directories, and actual values
  - filling gaps and removing duplicate data
  - validating indicator values and proposing adjustments by econometric methods according to a set of rules, including:
    - meeting balance ratios
    - staying within acceptable intervals
    - satisfying the ratios of the principal components (eigenvectors)
- providing data
  - creating and supporting a programming interface (REST API) to directories and to sets of primary and analytically verified data
  - uploading value sets and reference data in csv, xlsx, parquet, qs, and fst formats
  - documenting the collection, processing, and provision processes
  - recording scenarios (fact assessment, forecast, plan, scenario, and goal), stages (initial, amended, corrected, and stage number), versions, and methods of accounting for indicator values, which is necessary for the correct interpretation and tracing of indicator values
  - maintaining a library of verification methods and rules for the verification and validation of indicator values
  - preparing an interactive report on the scope, completeness, and identified errors
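Two of the validation rules named in the list above, balance ratios and acceptable intervals, can be sketched in a few lines of Python. This is a simplified illustration under assumptions: the field names, the tolerance, and the specific rules are hypothetical and do not come from the service's actual rule library.

```python
def within_range(value, low, high):
    """Acceptable-interval rule: low <= value <= high."""
    return low <= value <= high


def balances(total, parts, rel_tol=1e-6):
    """Balance rule: the components must add up to the total (within tolerance)."""
    return abs(total - sum(parts)) <= rel_tol * max(abs(total), 1.0)


def validate(record):
    """Return a list of human-readable issues found in one record."""
    issues = []
    parts = [record["population_male"], record["population_female"]]
    if not balances(record["population_total"], parts):
        issues.append("balance: male + female != total population")
    if not within_range(record["unemployment_rate"], 0.0, 1.0):
        issues.append("range: unemployment_rate outside [0, 1]")
    return issues


# Hypothetical records: the first passes, the second violates both rules.
good = {"population_total": 1000.0, "population_male": 480.0,
        "population_female": 520.0, "unemployment_rate": 0.07}
bad = {"population_total": 1000.0, "population_male": 480.0,
       "population_female": 500.0, "unemployment_rate": 1.4}

print(validate(good))
print(validate(bad))
```

Returning a list of issues rather than raising on the first failure suits a reporting workflow, where all identified errors for a record are collected before adjustments are proposed.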
Questions & proposals
All rights reserved Digital twin LLC