Skills and Competencies for enabling Open Data in public administration

Augusto Herrmann
23.4.2024

Who am I

  • Augusto Herrmann – https://herrmann.tech, background:
    • Math & CS (BSc. 2004-2007)
    • KM, IT Mgt., NLP (MSc. 2009-2011)
  • civil servant since 2004.
    • national open data policy (2010-2018)
    • data engineering (2020-current)

Where to find this presentation

https://herrmann.tech/slide-decks/2024/04/skills-and-competencies-for-enabling-open-data

Licensed under a Creative Commons Attribution 4.0 International License.

2010: A bottom-up open data policy

  • a team of 3 had to build a new policy without a budget
  • Dawn of the Open Government Partnership. Principles: Transparency, Collaboration, Participation
  • Influence: Civil Servant 2.0 – Davied van Berlo (2008)
  • The solution? Use the collective intelligence!

2011: A data portal built by citizens

  • Anyone interested could participate
    • processes, plans and schedule documented in the open
    • people of diverse backgrounds
    • meetings in informal places outside government premises

2012: An infrastructure for open data

2016: Open Data Plans

  • Decree 8.777: establishes an open data planning cycle that every ministry and agency must follow
  • Capacity building workshops, in just over a year:
    • over 700 people attended in person
    • almost 1,800 participants in the online course

Transition

  • 2019: open data policy management transferred to Office of the Comptroller-General – "CGU" (Decree 9.903)
  • 2020: Secretariat for Management and Innovation
    • Data management and data governance
    • responsible for overarching systems: data from all ministries and agencies
    • Apache Airflow for orchestrating data pipelines
      • also for Open Data where applicable!

Remaining Challenges

  • Data integration across national and local levels of government
  • Integration of international data and developing standards
  • Establish an effective feedback loop with data users
  • Identifying high value data and fostering data use
  • Maturity in data management and data governance
  • Data anonymization
  • Automatic data licensing

Data integration across levels of government

  • Brazil has 5,570 municipalities
    • even keeping track of every data portal URL is hard
  • federated data portals: make search easier for citizens
  • develop common data standards
  • foster a network of public officials across local administrations to discuss and share experiences on open data
  • leverage interested civil society groups and projects (e.g. Querido Diário, Frag den Staat)
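The idea of a federated search over many portals can be sketched as follows. This is only an illustration under stated assumptions: `federated_search` and the portal search functions are hypothetical names, and each portal is represented by a plain function that takes a query and returns dataset titles. A real deployment would call each portal's own search API instead.

```python
def federated_search(query, portals):
    """Run one query against several portals and merge the results.

    `portals` maps a portal name to a search function (hypothetical
    interface: takes the query string, returns a list of dataset titles).
    """
    results = []
    for name, search in portals.items():
        try:
            hits = search(query)
        except Exception:
            # One unreachable portal should not break the whole search.
            continue
        for title in hits:
            # Keep the source portal so users can tell where data comes from.
            results.append({"portal": name, "title": title})
    return results
```

Tagging each hit with its portal keeps provenance visible, which matters when thousands of municipalities publish similar datasets.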

International data integration and data standards

Effective feedback loop with data users

  • data portals must have a feedback mechanism so data users can report errors, ask for clarification, etc.

Identifying high value data and fostering data use

  • the dataset lists from the Open Data Charter / Barometer / Index are a start, but not enough
  • public consultations for open data plans are important but not enough
  • data publishers must engage with data-using communities, e.g. in data science / data engineering forums
  • participate in events such as Open Data Day
  • data publishers should hold events to showcase their best data

Maturity in data management and data governance

  • the organization should build capacity for public officials
  • the departments and people responsible for data should be defined and easy to find
  • have an established data lifecycle
  • make open data updates periodic and automatic
  • data should be well documented internally; this will in turn enable better documentation for open data
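One way to support periodic updates is to check each dataset's declared update frequency against the date of its last update. The sketch below is a minimal illustration, not an established method: the frequency labels, the `MAX_AGE` thresholds and the catalog fields (`id`, `frequency`, `last_updated`) are all assumptions. In practice an orchestrator such as the Apache Airflow mentioned earlier could run a check like this on a schedule.

```python
from datetime import date, timedelta

# Hypothetical mapping from a declared update frequency to a maximum age.
MAX_AGE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=31),
    "yearly": timedelta(days=366),
}


def is_overdue(last_updated, frequency, today):
    """True if the dataset is older than its declared frequency allows."""
    return today - last_updated > MAX_AGE[frequency]


def overdue_datasets(catalog, today):
    """Return the ids of catalog entries whose update is overdue."""
    return [
        entry["id"]
        for entry in catalog
        if is_overdue(entry["last_updated"], entry["frequency"], today)
    ]
```

For example, a monthly dataset last updated on 1 January would be reported as overdue on 1 March, while one updated on 20 February would not.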

Data anonymization

  • should be treated as a security issue (risk of re-identification)
    • e.g. have an internal "red team" try to re-identify anonymized data
  • omit or mask sensitive fields
  • data aggregation is helpful, but care should be taken: very small groups can still expose individuals
  • assess risks and document every decision
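The points above on omitting or masking fields and on careful aggregation can be sketched as follows. Everything here is an assumption made for illustration: the field names `name` and `tax_id`, the salted-hash masking, and the minimum group size of 5. Note that a salted hash is pseudonymization rather than full anonymization, so the re-identification risk still has to be assessed.

```python
import hashlib
from collections import Counter

# Hypothetical field names, chosen only for illustration.
DROP_FIELDS = {"name"}      # omitted entirely
MASK_FIELDS = {"tax_id"}    # replaced by a truncated salted hash
MIN_CELL_SIZE = 5           # assumption: smaller groups are suppressed


def mask_record(record, salt):
    """Return a copy of the record with sensitive fields omitted or masked."""
    cleaned = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            continue
        if key in MASK_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            cleaned[key] = digest[:12]
        else:
            cleaned[key] = value
    return cleaned


def aggregate_counts(records, group_field, min_cell_size=MIN_CELL_SIZE):
    """Count records per group; groups below the threshold become None."""
    counts = Counter(record[group_field] for record in records)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }
```

Suppressing small groups addresses the case where an aggregate covers so few people that a published count effectively identifies them.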

Automatic data licensing

  • licensing issues are complex
  • they depend on national copyright / database right laws
  • legal analysis should be centralized, to achieve an all-encompassing solution
  • license choice should come automatically from the top – neither middle-level officials nor developers can be expected to deal with such complex legal issues
  • usually best to use:

Thank you

Contact

Questions & feedback

👆❓ are welcome and appreciated