Research Description
The salaries of civil servants are an important indicator reflecting the effectiveness of public administration, transparency, and fairness in the distribution of state resources. The analysis of civil servants’ salaries makes it possible to assess and understand the dynamics of this indicator, identify gender and territorial differences, and conduct a comparative analysis across different position categories.
This study analyzes the overall dynamics of civil servants’ salaries, their breakdown by job category and region, and the gender component.
The study consists of four main sections. The first section analyzes the dynamics of civil servants’ salaries for the years 2020–2023, providing an overview of the general trends in salary increases or decreases over the analyzed period.
The second section examines the differences in salaries between men and women in public service and identifies possible gender disparities. The third section analyzes salaries depending on job categories, which allows for identifying possible differences in pay levels between various groups of positions. The fourth section focuses on the regional dynamics of civil servants’ salaries, revealing territorial differences in payment levels.
Data Specifics from Declarations
The data for this study were collected via the API of the Unified State Register of Declarations of Persons Authorized to Perform State or Local Government Functions, covering the years 2020–2023.
In total, over 688,000 declarations were selected for analysis: those in which the declarant indicated that the position belongs to the “Civil Service Position” category.
The declaration data are filled in personally by a large number of individuals and may contain errors, including technical ones, particularly in the section specifying the total salary received by civil servants, as well as in other parts of the declaration.
To detect and correct invalid salary values, 10,000 declarations were manually labeled to identify those containing errors. A Random Forest classifier trained on this labeled dataset estimates the likelihood that a declaration contains an error. More than 1,500 declarations were identified as containing errors in the reported annual salary, and appropriate downscaling coefficients were applied to them.
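The error-detection step described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the study's actual pipeline. The feature set (`build_features`), the assumed error pattern (amounts entered 100 or 1,000 times too large), the 0.5 threshold, and the power-of-ten rule in `downscale` are all assumptions made for the example. The source does not specify which features or coefficients were used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def build_features(salary, category_median):
    """Hypothetical features: salary magnitude and its ratio to the category median."""
    salary = np.asarray(salary, dtype=float)
    category_median = np.asarray(category_median, dtype=float)
    return np.column_stack([np.log10(salary), np.log10(salary / category_median)])


# Synthetic stand-in for the manually labeled declarations (not real data).
rng = np.random.default_rng(0)
n_valid, n_error = 950, 50
medians = rng.choice([120_000.0, 170_000.0, 250_000.0], size=n_valid + n_error)
salary = medians * rng.lognormal(mean=0.0, sigma=0.3, size=n_valid + n_error)
labels = np.zeros(n_valid + n_error, dtype=int)
# Assumed error pattern: the last n_error amounts are 100 or 1000 times too large.
salary[n_valid:] *= rng.choice([100.0, 1000.0], size=n_error)
labels[n_valid:] = 1

# Train the classifier on the labeled sample.
model = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
model.fit(build_features(salary, medians), labels)


def error_probability(salary, category_median):
    """Estimated probability that each reported salary is erroneous."""
    return model.predict_proba(build_features(salary, category_median))[:, 1]


def downscale(salary, category_median):
    """Assumed correction: divide by the power of ten nearest to the salary/median ratio."""
    salary = np.asarray(salary, dtype=float)
    exponent = np.round(np.log10(salary / np.asarray(category_median, dtype=float)))
    return salary / 10.0 ** np.maximum(exponent, 0.0)


# Score two unlabeled declarations and correct only the flagged one.
reported = np.array([180_000.0, 18_000_000.0])
category_median = np.array([170_000.0, 170_000.0])
probs = error_probability(reported, category_median)
corrected = np.where(probs > 0.5, downscale(reported, category_median), reported)
```

In this sketch the second amount is roughly a hundred times the category median, so it is flagged and divided by 100, while the first is left unchanged. A real pipeline would likely hold out part of the labeled sample to measure how often the classifier misses errors or flags valid declarations.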
Missing information about the region where a civil servant works was filled in when the institution employing the declarant had a defined location elsewhere in the dataset. The gender of the declarant was determined from the person’s first name and patronymic.
The presented analysis should be used solely to assess general trends and patterns, as the declaration data may contain errors that affect the final results. However, given the substantial sample size and the law of large numbers, the average values of the indicators are likely to reflect reality. A fully accurate analysis would require verification of all declaration data.
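The region fill-in and the name-based gender determination described in this section could look roughly like the sketch below. The field names (`institution`, `region`, `first_name`, `patronymic`), the sample records, the suffix lists, and the tiny first-name lookup are hypothetical. Using the most common region among an institution's other declarations is one plausible reading of "a defined location within the general dataset", not a confirmed detail of the study.

```python
from collections import Counter, defaultdict

# Simplified Ukrainian patronymic endings (assumed rule, not exhaustive).
FEMALE_SUFFIXES = ("івна", "ївна")   # e.g. Іванівна, Андріївна
MALE_SUFFIXES = ("ич", "іч")         # e.g. Петрович, Ілліч
# Tiny illustrative first-name fallback; a real lookup would be far larger.
FIRST_NAME_GENDER = {"олена": "female", "андрій": "male"}


def infer_gender(first_name, patronymic):
    """Return 'female', 'male', or None if neither the patronymic nor the name decides."""
    p = (patronymic or "").strip().lower()
    if p.endswith(FEMALE_SUFFIXES):
        return "female"
    if p.endswith(MALE_SUFFIXES):
        return "male"
    return FIRST_NAME_GENDER.get((first_name or "").strip().lower())


def fill_missing_regions(records):
    """Fill empty regions with the most common region seen for the same institution."""
    seen = defaultdict(Counter)
    for rec in records:
        if rec.get("region"):
            seen[rec["institution"]][rec["region"]] += 1
    lookup = {inst: counts.most_common(1)[0][0] for inst, counts in seen.items()}
    # Return new dicts so the input records are left unmodified.
    return [{**rec, "region": rec.get("region") or lookup.get(rec["institution"])}
            for rec in records]


# Made-up records for illustration only.
records = [
    {"institution": "Institution A", "region": "Lviv region",
     "first_name": "Олена", "patronymic": "Іванівна"},
    {"institution": "Institution A", "region": None,
     "first_name": "Андрій", "patronymic": "Петрович"},
    {"institution": "Institution B", "region": None,
     "first_name": "Андрій", "patronymic": ""},
]

filled = fill_missing_regions(records)
genders = [infer_gender(r["first_name"], r["patronymic"]) for r in records]
```

Here the second record inherits the region of the first because both share an institution, while the third stays unfilled because its institution has no known location. The third record's gender falls back to the first-name lookup because the patronymic is empty.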