As the stewards of ethical web data collection, the Ethical Web Data Collection Initiative (EWDCI) believes that clarity, accountability, and collaboration are foundational to a thriving, responsible web data ecosystem. Four core principles guide this initiative:
- Legality
- Ethics
- Social Responsibility
- Ecosystem Engagement
We look to these principles as we engage in conversations with policymakers to protect this industry’s foundational data-collection objectives. We have identified four primary policy priorities, which we pursue to maintain and nurture a healthy data collection industry that can positively shape outcomes in customer experience, machine learning, and data security.
EWDCI Policy Priorities
As we describe our policy priorities, you will see how they connect with our core principles in a concrete way.
1—Publicly accessible information should not be subject to unreasonable proprietary restrictions that limit interoperability, research, competition, or innovation.
Restrictions on information accessibility chill innovation and disproportionately harm smaller actors. Our efforts here are guided by our Ethics and Ecosystem Engagement principles.
2—Legitimate access is access to data that is published on the Internet and not restricted by a log-in.
We advocate for lawful access to data. This means that we respect paywalls and authentication systems designed to limit access by the general public, as well as those that restrict AI agents.
This work is based upon our Legality and Ecosystem Engagement principles.
3—Copyright law protects creative expression and does not extend to facts themselves.
Copyright protects original expression only. Facts are not copyrightable because they are discovered, not created. In advocating for this idea, we are guided by our Legality, Ethics, and Social Responsibility principles.
4—Restricting access to public data threatens the continued development of the Internet.
Policies that leave data publicly posted but legally inaccessible are incoherent. This scenario benefits incumbents who already have the data while blocking new entrants, researchers, and the public. This asymmetry of access distorts markets and concentrates power. EWDCI seeks to preserve meaningful access to public information, which supports competition, academic research, journalism, security research, and the continued openness of the Internet ecosystem.
All four of our core principles—Legality, Ethics, Social Responsibility, Ecosystem Engagement—guide us here.
The Public Deserves Digital Peace Of Mind
By explicitly surfacing these priorities, we hope to inspire others working in the web data collection industry to join us in making the Internet a better place for everyone by strengthening public trust, promoting ethical guidelines, and helping businesses make informed data aggregation choices.
[Key image: Photo by Mauricio Artieda on Unsplash]
