This case study builds upon the article feedback and ethical map (inserted below). It explores the internal cultural factors that enable bad data practices, and where those factors originate, distinguishing between stated values and everyday norms. It also considers solutions for addressing ethical considerations in data storage, such as privacy, security, and responsible use of personal information.
A business use case for ethical data storage
Data warehouses (DW) have become essential to many organisations because they provide a centralised repository for collecting, storing, and managing large amounts of data. “Hardware and software vendors such as IBM, Oracle, Informix®, Sun, Compaq, EMC, Veritas have issued a variety of white papers and tuning recommendations.” (Nicola, 2002, p.2) If these recommendations are followed, a DW can support decision-making processes, the identification of trends and patterns, and the analysis of customer behaviour.
“Oracle Corporation, a leading player in database and data warehousing technologies, believes there are only three major steps in building a data warehouse: [Feed, Store and Use.]” (Baker & Baker, 1999, p.35) The final step, having a plan for using the data, is crucial if the organisations that increasingly turn to DW are to gain a competitive advantage in their industry. W.H. (Bill) Inmon, the recognised father of DW and founder of Pine Cone Systems, Inc., a leading DW software and consulting firm, believed that the first iteration of building a DW would take only a few months, while the last iteration may take years and is an ongoing process. (Baker & Baker, 1999, p.36) A well-established central data repository provides businesses with a secure, reliable, and consistent source of information. However, in addition to the obvious business benefits, continued DW investment over the long term can also offer significant ethical, cultural and regulatory benefits to which employees should pay attention.
In today’s digital age, organisations hold vast amounts of personal, health, financial, geolocation, behavioural, and biometric data, increasing the need for data governance and compliance. DW solutions have been implemented successfully across several industries including “[…] manufacturing (for order shipment and customer support), retail (for user profiling and inventory management), financial services (for claims analysis, risk analysis, credit card analysis, and fraud detection), transportation (for fleet management), telecommunications (for call analysis and fraud detection), utilities (for power usage analysis), and healthcare (for outcomes analysis).” (Chaudhuri & Dayal, 1997, p.1)
In this article, we will limit the scope to personal data. Various benefits and harms can be associated with different factors, such as the data source, the collection purpose, and how the data is used and stored. Benefits of personal data include personalised marketing and improved customer service, while harms include privacy violations, identity theft, and discrimination. It is important to note that a person can be identified, directly or indirectly, via identifiers such as a name, an ID number, or location data, or via factors specific to the person’s physiological, genetic, economic, cultural, or social identity. This type of data, when not shared via appropriate channels, can blur ethical boundaries once it becomes disconnected from its origin.
Issues: Admin, Tech, Legal, Social, Monetary and Political
Centralising personal data in a secure DW addresses the administrative challenges that arise from improper data handling. Held this way, the data is protected, free of duplication, accessible only to authorised and trained personnel, consistent, encrypted and confidential, accurate, and adequately documented. The increasing amounts of data from various sources lead to data discrepancies that must be recognised and resolved before decisions are made on inaccurate information. (Abraham et al., 2019) Moreover, efficient data storage should preserve a single instance of each piece of data and eliminate duplicate copies, reducing storage overhead and improving load times. (Prajapati & Shah, 2022)
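The idea of keeping a single instance of each record can be illustrated with a minimal Python sketch. It is not taken from the cited sources. It assumes records arrive as dictionaries and that two records count as duplicates when their values match after trimming whitespace and lowercasing. The function names `record_fingerprint` and `deduplicate` are hypothetical, and a real warehouse would need domain-specific matching rules.

```python
from __future__ import annotations

import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Return a stable SHA-256 fingerprint of a record's normalised content."""
    # Assumption: values are comparable after trimming and lowercasing.
    normalised = {
        key: str(value).strip().lower()
        for key, value in record.items()
    }
    # sort_keys makes the serialisation independent of field order.
    payload = json.dumps(normalised, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first instance of each distinct record, preserving order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for record in records:
        fingerprint = record_fingerprint(record)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique
```

Hashing the normalised content means only the fingerprints need to be compared, not the personal data itself, so the duplicate check does not require another readable copy of the records.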
Compliance with data privacy laws such as the EU General Data Protection Regulation (GDPR) and the Privacy Act 1988 (Privacy Act) is crucial to prevent legal issues arising from data sharing without consent, inadequate internal processes, and breaches of confidentiality. A robust data privacy framework enables organisations to adapt swiftly to regulatory changes and ensure data privacy and security. Sharing personal information without consent can breach the Privacy Act.
IBM’s IMS (Information Management System), which was released in 1966 and based on a hierarchical database, was developed to manage the extensive bill of materials for the Saturn V moon rocket and Apollo space vehicle. (Praveen et al., 2017, p.35) “Database systems from Oracle, Informix, IBM, and others can handle these large-scale data access demands, and these vendors will no doubt continue to expand as businesses perform even more detailed analysis.” (Baker & Baker, 1999, p.37) Technology issues can arise from external data storage, insufficient internal data, outdated or redundant systems, slow performance, and weak security, all of which drive up IT costs. Due to the demand for decentralisation in enterprises and increased geographical dispersion, there is a necessity for database replication (such as IBM’s Informix) to provide location transparency to employees. (Moiz et al., 2011, p.1) DWs resolve these issues by creating a centralised data repository, improving data accuracy and reducing the risk of conflicting information.
Social issues can stem from poor organisational culture, rigid structures, fear of reporting issues, top-down power dynamics, unrealistic deadlines, sensitive data sharing, and unequal data access. “Larry Bramblett, principal of Data Warehouse Solutions, LLC, says companies implementing data warehouses often fail when it comes to setting priorities and following through.” (Baker & Baker, 1999, p.36) Otto’s second organisational dimension is the organisational form, such as the structure in which responsibilities are specified and assigned, and the process organisation […] data governance should not be seen as a ‘one size fits all’ approach. (Brous et al., 2016, p.119) Dissimilarities in features and cost structures between on-premises and cloud service provider (CSP) infrastructures present both unique opportunities and potential drawbacks that organisations need to consider. (Kahn et al., 2022) Cloud users may distrust CSPs because stored data can be confidential and sensitive, leading to concerns about data control and unauthorised access. (Gupta et al., 2022) Over time, both business requirements and organisational structures evolve, which means that the technology employed for data warehousing applications will also need to adapt accordingly. (Baker & Baker, 1999, p.37) To effectively fund, staff and support the data management function, executives must thoroughly comprehend and wholeheartedly adopt fundamental data management principles. (Mosley et al., 2009) In addition, inefficient reporting leads to missed business opportunities and limited insights.
Conflicts of interest and the Code of Professional Conduct
Even with appropriate safeguards in place, monetary and political issues can arise when data is valuable, confidential and easy to monetise. Good data governance is equally important for managing conflicts of interest, especially in a large firm with multiple clients, to avoid breaches of obligations under the law and the Code of Professional Conduct.
Many Australians and businesses, including governments, entrust professionals with sensitive information. Therefore, incorporating good data practices in using a DW involves several key steps: data governance, data quality, data security, data privacy, data access/usage, and data analytics.
- Data Governance ensures responsible and transparent data management policies and procedures, including explicit customer consent, secure storage, ethical use (without harmful or discriminatory practices), and customer access to data and its retention.
- Data Quality focuses on accurate, complete, consistent, and timely data in the warehouse through regular checking and cleaning. “Data from various sources go through tremendous Extract, Transform and Loading (ETL) [processes] that clean, transform and modify data so that [it] could match multidimensional schema in DW.” (Abai et al., 2013, p.802)
- Data Security implements encryption, access controls, backups, and secure disposal to protect sensitive or embargoed data from unauthorised access or loss.
- Data Privacy follows privacy laws by safeguarding personal information and collecting only necessary data for business purposes.
- Data Access and Usage sets clear roles and responsibilities for accessing and using data, ensuring employee awareness and compliance, and following regulatory and privacy requirements for external data sharing.
- Data Analytics extracts meaningful insights through unbiased data analysis and interpretation, using the DW’s comprehensive view of operations to make informed decisions.
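The access and usage step above can be made concrete with a short Python sketch of role-based field access. It is an illustration only and rests on several assumptions: the `ROLE_PERMISSIONS` policy, the role names, the field names, and the `read_record` helper are all hypothetical, and a production DW would normally enforce such rules in the database or a dedicated access layer, not in application code.

```python
from __future__ import annotations

# Hypothetical policy: each role may see only the fields it needs.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "analyst": {"customer_id", "postcode"},
    "support": {"customer_id", "name", "email"},
    "steward": {"customer_id", "name", "email", "postcode", "date_of_birth"},
}


def read_record(record: dict, role: str, audit_log: list[dict]) -> dict:
    """Return only the fields the role is permitted to see, and log the access."""
    # Unknown roles get an empty permission set, so access is denied by default.
    allowed = ROLE_PERMISSIONS.get(role, set())
    visible = {key: value for key, value in record.items() if key in allowed}
    # Record who saw which fields, never the values themselves.
    audit_log.append({"role": role, "fields": sorted(visible)})
    return visible


if __name__ == "__main__":
    customer = {
        "customer_id": 17,
        "name": "Ada Lovelace",
        "email": "ada@example.com",
        "postcode": "2000",
        "date_of_birth": "1815-12-10",
    }
    log: list[dict] = []
    print(read_record(customer, "analyst", log))
    print(log)
```

Denying access by default and logging field names without their values keeps the sketch in line with the privacy and accountability points in the list: an unrecognised role sees nothing, and the audit trail does not become another copy of the personal data.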
Data stewards can promote these steps through written policies and procedures, staff training, and accountability, with clear lines of authority for decisions about privacy and the security of personal information. Humans, not machines or code, are responsible for ethical decision-making in data-driven processes: machines lack consciousness, free will and moral emotions, and the ethical implications of their decisions are determined by human values and intentions.
A crisis typically unfolds in three stages: before, during, and after. It’s crucial to prioritise prevention efforts. During the crisis, we realise that even well-intentioned companies and employees can face adversity. However, learning from our mistakes often occurs primarily after the crisis has passed.
To err is human; however, unsafe practices and procedures should not be mistaken for human error. “Data governance includes a clearly defined authority to create and enforce data policies and procedures” (Brous et al., 2016, p.120). There is a need for a Hippocratic Oath to do no harm and put the wellbeing of people first.
The bigger picture: Cultural benefits of a data warehouse
Adhering to ethical data practices protects individuals, benefits organisations through Corporate Social Responsibility (CSR), and creates a data-driven culture. A data-driven culture that uses data to inform decision-making, drive innovation, and promote a data-literate workforce leads to improved collaboration and a continuous learning mindset. “To grasp the remarkable insights concealed inside such data, such data must be [analysed] and extracted, which is known as big data analytics.” (Kumar et al., 2022, p.96)
The long-term investment in a DW can improve collaboration and communication between teams leading to better outcomes by providing a centralised repository of data and fostering open relationships with data stakeholders. In addition, the principles of CSR promote adherence to ethical standards and best practices in creating a DW that supports sound decision-making, safeguards stakeholders, and strengthens the reputation of organisations through increased customer trust and reduced risk of data breaches.
Emphasis is placed on the impact of data stewards on the community: promoting community-led and controlled data practices over corporate interests, and using existing solutions before turning to data-driven ones, in order to empower and sustain communities and free them from exploitation. A DW is non-volatile: users are unable to modify or revise the data, which ensures that all users are working from the same information. (Watson, 2002)
In conclusion, it is vital to understand the internal cultural factors that enable bad data practices and to take steps to address ethical considerations in data storage. Organisations can ensure that their data practices are ethical and practical by prioritising privacy, security, and responsible use of personal information. By implementing solutions that address these issues, we can foster a culture of good data practices and ensure that employees handle sensitive information with care.
Abai, N. H. Z., Yahaya, J. H., & Deraman, A. (2013). User requirement analysis in data warehouse design: a review. Procedia Technology, 11, 801-806.
Abraham, R., Schneider, J., & Brocke, J. (2019). Data governance: A conceptual framework, structured review, and research agenda. International Journal of Information Management, 49(1), 424-438. ISSN 0268-4012. https://doi.org/10.1016/j.ijinfomgt.2019.07.008
Baker, S., & Baker, K. (1999). The best little warehouse in business. Journal of Business Strategy, 20(2), 32-39.
Brous, P., Janssen, M., & Vilminko-Heikkinen, R. (2016). Coordinating decision-making in data management activities: a systematic review of data governance principles. In Electronic Government: 15th IFIP WG 8.5 International Conference, EGOV 2016, Guimarães, Portugal, September 5-8, 2016, Proceedings 15 (pp. 115-125). Springer International Publishing.
Chaudhuri, S., & Dayal, U. (1997). An overview of data warehousing and OLAP technology. SIGMOD Record, 26(1), 65–74. https://doi.org/10.1145/248603.248616
Gupta, I., Singh, A. K., Lee, C. N., & Buyya, R. (2022). Secure data storage and sharing techniques for data protection in cloud environments: A systematic review, analysis, and future directions. IEEE Access.
Kahn, M. G., Mui, J. Y., Ames, M. J., Yamsani, A. K., Pozdeyev, N., Rafaels, N., & Brooks, I. M. (2022). Migrating a research data warehouse to a public cloud: challenges and opportunities. Journal of the American Medical Informatics Association, 29(4), 592-600.
Kumar, H., Soh, P. J., & Ismail, M. A. (2022). Big data streaming platforms: a review. Iraqi Journal for Computer Science and Mathematics, 3(2), 95-100.
Mosley, M., Brackett, M., & Earley, S. (2009). The DAMA Guide to The Data Management Body of Knowledge (1st ed.). The Data Management Association. 1-406.
Moiz, S. A., Sailaja, P., Venkataswamy, G., & Pal, S. N. (2011). Database replication: A survey of open source and commercial tools. International Journal of Computer Applications, 13(6), 1-8.
Nicola, M. (2002). Storage Layout and I/O Performance Tuning for IBM Red Brick Data Warehouse. IBM DB2 Developer Domain, Informix Zone.
Schlackl, F., Link, N., & Hoehle, H. (2022). Antecedents and consequences of data breaches: A systematic review. Information & Management, 103638.
Prajapati, P., & Shah, P. (2022). A review on secure data deduplication: Cloud storage security issue. Journal of King Saud University-Computer and Information Sciences, 34(7), 3996-4007.
Praveen, S., Chandra, U., & Wani, A. A. (2017). A literature review on evolving database. International Journal of Computer Applications, 162(9), 35-41.
Watson, H. J. (2002). Recent developments in data warehousing. Communications of the Association for Information Systems, 8(1), 1.
Federal Register of Legislation. (2022, December 17). Privacy Act 1988. Australian Government. <https://www.legislation.gov.au/Details/C2022C00361>.
Office of the Australian Information Commissioner. (2018, June 8). Australian entities and the EU General Data Protection Regulation (GDPR). Australian Government. <https://www.oaic.gov.au/privacy/guidance-and-advice/australian-entities-and-the-eu-general-data-protection-regulation>.