Data Ethics and Privacy in the SSIX Project

The SSIX Project Consortium recognises that data ethics and privacy issues need serious attention. To address these matters specifically, the Consortium has formed a Data and Business Ethics Board (DBEB), consisting of the project partners and independent ethics advisers. The board has produced a high-level ethics framework that aims to resolve these concerns as fully as possible while keeping any adverse impact on the SSIX platform’s operation to a minimum.

The board has recommended that, along with a Consent Manager on the website, this privacy page be added to explain what data is being collected and what is being done with it. By providing this transparency, the project aims to show the public that it follows strong ethical principles and to address any concerns they may have.

What is the project's goal?

The SSIX project's aim is to provide European SMEs with a collection of easy-to-use tools to analyse and gauge the sentiment of social network users on any given topic, giving them valuable business intelligence which can be added to their decision-making process.

What data sources does the project use?

While SSIX is primarily focused on social networks like Twitter, the project will also use professional news feeds and blog content sources.

What does the project do with the collected data?

We strive to maintain the highest level of anonymity of all individual users in our work, only keeping data which is essential to the project's objectives. Once the collected data has been filtered to remove spam and irrelevant content, the SSIX NLP pipeline produces aggregated sentiment metrics. We will destroy all personal data once it is no longer needed for the project's purposes.

Is the data which the project collects public?

Yes. The SSIX platform will only collect data from publicly available sources, which means that in principle all relevant authorisation and consent have been provided by the owners of the data. The project will only access social network content through the official APIs. Users on these networks have given consent to the network to share their data with third parties. Additionally, social networks like Twitter allow users to make their account private.

However, the Data and Business Ethics Board recognises that the interpretation of privacy laws varies across the EU and that social network data which is public might be considered private even if the user has given consent to the social network to share their data. This legal grey area is a concern, but it is not practical for the project to obtain a double opt-in from social network users. Doing so would require users to voluntarily opt in to SSIX data collection, or SSIX to contact every single Twitter user requesting consent to use their data. No similar analytics service performs this double opt-in.

To address this issue, SSIX has provided a Consent Manager on the project website, which allows the public to request an opt-out from SSIX's data collection. The opt-out is blind, as the user will have no way of knowing whether their data has been collected. The user will need to submit certain details enabling the system to identify and remove collected content from that user. If a participant voluntarily provides their social network account ID number, either via the SSIX Consent Manager apps or by email, they are sending only their account ID to the SSIX DBEB. From the date of receipt, we will destroy all request communications. For a participant's content to be removed from SSIX activities, the account ID is added to a static blacklist table, and all incoming content from accounts matching this blacklist is automatically discarded.
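
As an illustration only, the blacklist check described above could look like the following Python sketch. The names `BLACKLISTED_ACCOUNT_IDS` and `filter_incoming`, the message fields `account_id` and `text`, and the sample IDs are all assumptions made for this example; they are not taken from the SSIX implementation.

```python
from typing import Iterable, Iterator

# Hypothetical static blacklist of opted-out account IDs (illustrative values).
BLACKLISTED_ACCOUNT_IDS = frozenset({"1001", "1002"})


def filter_incoming(
    messages: Iterable[dict],
    blacklist: frozenset = BLACKLISTED_ACCOUNT_IDS,
) -> Iterator[dict]:
    """Yield only messages whose account ID is not on the blacklist."""
    for message in messages:
        # Compare as strings so numeric and string IDs are matched consistently.
        if str(message["account_id"]) in blacklist:
            continue  # discard content from opted-out accounts
        yield message


incoming = [
    {"account_id": "1001", "text": "message from an opted-out account"},
    {"account_id": 2002, "text": "message from any other account"},
]
kept = list(filter_incoming(incoming))
```

Because the check runs on every incoming message, content from an opted-out account is dropped before any storage or analysis step sees it.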

The Consent Manager can be found at


How is data stored in the SSIX Project?

We have implemented a Data Management Plan (DMP). The DMP describes the data management life-cycle for all the data sets that will be collected, processed or generated by the project. The DMP is not a fixed document and will evolve during the lifespan of the project.

What data will be shared by the SSIX Project?

All data released by the project will need the approval of the SSIX Data and Business Ethics Board (DBEB). We have no plans to share gathered data outside the SSIX project team. One possible exception is the EU Open Research Data Pilot, if the board agrees to it.

Transmission of content to third-party services

In the NLP pipeline, the SSIX platform may make use of third-party systems during analysis. SSIX does not have the resources to cover every language with native classifiers. To analyse content in an unsupported language, the SSIX platform will therefore use machine translation to convert it into a language that is supported by a native classifier. Currently SSIX is using project partner Lionbridge's GeoFluent API to perform these translation tasks. During this process, only tweet message content is sent out by SSIX.
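
The routing described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the language set, the pivot language, the tweet fields `lang` and `text`, and the `translate` callable are all hypothetical. In particular, `translate` is a placeholder and does not represent the GeoFluent API or its signature.

```python
from typing import Callable, Tuple

# Assumed set of languages with a native classifier (illustrative only).
NATIVE_LANGUAGES = {"en", "de"}
# Assumed supported language that unsupported content is translated into.
PIVOT_LANGUAGE = "en"


def prepare_for_classification(
    tweet: dict,
    translate: Callable[[str, str, str], str],
) -> Tuple[str, str]:
    """Return (language, text) ready for a native classifier."""
    language, text = tweet["lang"], tweet["text"]
    if language in NATIVE_LANGUAGES:
        return language, text
    # Only the message text is handed to the translation service;
    # no user identifiers or other tweet fields are passed on.
    return PIVOT_LANGUAGE, translate(text, language, PIVOT_LANGUAGE)


def placeholder_translate(text: str, source: str, target: str) -> str:
    """Stand-in for a real translation service, used here for demonstration."""
    return f"[{source}->{target}] {text}"


native = prepare_for_classification({"lang": "de", "text": "Guten Tag"}, placeholder_translate)
translated = prepare_for_classification(
    {"lang": "fr", "text": "Bonjour", "user_id": 42}, placeholder_translate
)
```

Passing the translation function in as a parameter keeps the sketch independent of any particular provider and makes it easy to see exactly which values leave the platform.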

Additionally, the SSIX platform may make use of project partner Redlink's API suite to perform further analysis.

Summary of the data ethics issues identified within the scope of the SSIX platform activities:

  1. Privacy & Security: Collection and storage of data.
    1. SSIX will collect publicly available data from social networks, blogs and news providers.
    2. Where data are not publicly available or if specific authorisation is needed, relevant Informed Consent will be obtained before any collection, use or processing of relevant data.
    3. SSIX provides social network users a blind opt-out from data collection via the project website.
    4. Collected data will be stored securely as outlined in the project's Data Management Plan.
    5. The project has agreed to the Open Research Data Pilot (ORDP). However, the project will not share any collected data or analytics with any third party without approval of the DBEB.
    6. All collected data will be destroyed once it is no longer needed for the project's purposes.
  2. Analytics: Creating aggregated insight from collected data.
    1. Certain identifiable user data needs to be stored for filtering and categorisation. The following details will not be used: name, address, age, sex, photos, date of birth, metadata. Data quality is of fundamental importance to the project outcomes, so strong filtering rules are needed due to the prevalence of bots, fake data, spammers, etc.
    2. Data such as a user ID is required to remove spam messages. The user profile description will be needed to classify some users into a domain category, such as a Twitter user who declares themselves an Analyst or Trader.
    3. General data may be kept to categorise users by location. Location detail will be no finer than regional level.
  3. Business Ethics: How the aggregated metrics will be used.
    1. The end analytics from the platform are aggregated and do not target any one user.
    2. Group Data - only high-level domain grouping from professional fields.
    3. The project has no plans to identify individuals, to build up a profile of an individual user or a group of individuals, or to aggregate a user's data across multiple social networks.
    4. The project does not intend to do anything that discriminates on any of the current sensitive grounds.
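
To make the data minimisation points under "Analytics" in the summary above more concrete, the following Python sketch shows one way a profile record could be reduced before storage. The field names, the `minimise_profile` function, and the sample record are assumptions for illustration only and do not describe the actual SSIX data model.

```python
# Fields that the summary says will not be used (field names are assumed).
EXCLUDED_FIELDS = {
    "name", "address", "age", "sex", "photos", "date_of_birth", "metadata",
}


def minimise_profile(profile: dict) -> dict:
    """Drop excluded fields and keep location at regional level only."""
    kept = {key: value for key, value in profile.items() if key not in EXCLUDED_FIELDS}
    location = kept.pop("location", None)
    if isinstance(location, dict):
        # Keep only the region; city, street, or coordinates are discarded.
        kept["region"] = location.get("region")
    return kept


sample = {
    "user_id": "2002",
    "name": "A. Trader",
    "description": "Analyst",
    "age": 40,
    "location": {"region": "Bavaria", "city": "Munich"},
}
minimal = minimise_profile(sample)
```

In this sketch the user ID and profile description survive, since the summary says they are needed for spam removal and domain categorisation, while everything on the exclusion list is dropped.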



What are cookies?

A “cookie” is a text file that websites send to a visitor’s computer or other Internet-connected device to uniquely identify the visitor’s browser or to store information or settings in the browser.

What types of cookies are used on the Sites and what choices do you have?

Below is a list of the cookies that we would like to use on our Sites. You can indicate your acceptance of all cookies or a selection of cookies that you allow us to use on our Sites. Please note that, depending on your selection, you may not be able to take full advantage of the Sites. Please also understand that a cookie will be placed on your device to allow us to remember your choice.

Cookie Type: Description

Functional Cookies: Functional cookies allow the Sites to help maintain your session and remember choices you have made in order to provide functionality for your benefit. Functional cookies are used to collect information about your language preferences and other preferences indicated during your visit to the Sites. For example, functional “session” cookies allow the Sites to remember settings specific to you, such as your country selection, which ultimately improves your web experience. Functional cookies also include persistent cookies which are used to remember your preferences when you subsequently visit our Sites.

Analytic Cookies: Analytic cookies are used to gather statistics about the use of the Sites in order to improve the performance and design of the Sites and our services. For these purposes, analytic cookies collect information about your device type, operating system type, browser type, domain, other system settings, IP address, referring URLs, information on actions taken on the Sites and the dates and times of your visits, as well as the country and time zone in which your device is located. These cookies are provided by our third-party analytics tool provider, Google Analytics, and the information obtained through these cookies will be disclosed to or collected directly by this third-party service provider. For more information about Google’s cookie and information practices (including the types of cookies used and their expiration date), please visit the link below:

Google Analytics -

Third Party Cookies: We partner with third parties to provide you with connections to certain social networks, such as Twitter, and to provide you with additional features, such as YouTube and Google Docs. When you engage with third-party plug-ins and widgets on our Sites, such third parties may place session or persistent cookies on your browser. The use of these cookies is subject to such third party’s own cookie policies linked below:


How do I disable or remove cookies?

You may restrict or disable the use of cookies through your web browser. Each type of web browser offers ways to restrict and delete cookies. For more information on how to manage cookies, please visit the appropriate link below.

Internet Explorer
Google Chrome


Where can I get more information?

You can contact the Data and Business Ethics Board with