AI Ethics and Fairness Resources

AI and data ethics and fairness have become hot topics lately. From computer vision models that fail to see everyone equally to the debacle at Google's AI division, it's something that we all need to look out for when doing any type of work with data.

With that, I'd like to share some resources that I have found useful when researching this topic. Some are videos that go over how bias can get into data, and others are research papers that go over how to help mitigate bias.

For a video version of this post, check below:

Videos

There are quite a lot of videos that go over AI ethics. Below are a few of my favorites that have a good amount of information in them.

  • The Trouble with Bias by Kate Crawford - This talk was given at the Neural Information Processing Systems (NIPS) conference in 2017. Not only does Kate go over what exactly bias is in machine learning models, but she also goes over the harms that it can cause.

  • Machine Learning and Fairness by Hanna Wallach and Jennifer Wortman Vaughan - This is actually one of my favorite resources on the list. This video goes into several aspects of fairness in machine learning, including the types of bias that can be in your data as well as ways to help mitigate them, such as the Datasheets for Datasets paper that's linked in the papers section.

  • Transparency and Intelligibility Throughout the Machine Learning Life Cycle by Jennifer Wortman Vaughan - This talk walks through the entire machine learning life cycle and shows how best to incorporate transparency at each stage.

Courses

There are a couple of courses that go over AI ethics and I believe more will be on the way as time goes on.

  • FastAI Ethics - FastAI's ethics course is probably one of the most comprehensive out there. It has several lectures and each lecture has supplemental materials such as articles and even research papers.

Books

Just as courses are coming out to teach people about AI ethics, books are coming out to do the same and to show how you can prevent bias from creeping into your models.

  • Interpretable AI by Ajay Thampi - One of the first books I've seen on this subject, it helps you understand why interpretable models are needed and shows how to build them.

Papers and Documents

A lot of the information in the other categories comes from earlier research on data bias and AI ethics. That research has also produced some documents that help people who create models mitigate bias in their data.

  • Manipulating and Measuring Model Interpretability - This paper goes into how to measure model interpretability. It also helps answer the question of what interpretability means for a machine learning model.

  • Datasheets for Datasets - In electronics, each component is accompanied by a datasheet that describes its characteristics, any testing done on it, etc. This paper proposes the idea of having the same for machine learning data.

  • AI Fairness Checklist - This document has a checklist that one can follow throughout the lifecycle of creating a model to look out for fairness issues.
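To make the datasheet idea a bit more concrete, here is a minimal sketch of what a datasheet could look like in code. The field names loosely follow some of the section headings in the Datasheets for Datasets paper, but the class, the helper method, and the example dataset are all hypothetical and not something the paper prescribes.

```python
from dataclasses import dataclass, asdict


@dataclass
class Datasheet:
    """A hypothetical, heavily simplified datasheet for a dataset."""
    name: str
    motivation: str          # why the dataset was created
    composition: str         # what the instances are and who they represent
    collection_process: str  # how the data was gathered
    uses: str                # intended (and discouraged) uses

    def missing_fields(self):
        # Report fields that were left blank so gaps are visible before release.
        return [key for key, value in asdict(self).items() if not value.strip()]


# Made-up example dataset, for illustration only.
sheet = Datasheet(
    name="customer_churn_v1",
    motivation="Predict which customers are likely to cancel.",
    composition="One row per customer account.",
    collection_process="",
    uses="",
)

print(sheet.missing_fields())
```

A blank field is not necessarily a problem, but seeing it listed is a prompt to ask why the information is missing.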

Tools

Thankfully, there are some tools out there that can help us interpret how models are making their predictions as well as assess fairness within the models.

  • Microsoft Fairlearn - This Python tool helps assess the fairness in your data. There is a demo available for this that helps show how it works.

  • Microsoft InterpretML - Another Python tool to help interpret machine learning models. This one also has a demo available.
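To give a feel for the kind of question a fairness tool helps answer, here is a rough sketch in plain Python of one common group fairness measure: the gap in selection rates between groups, often called the demographic parity difference. This is not Fairlearn's API. The function names and the toy data are assumptions made up for illustration, and a real assessment would use a proper library and real predictions.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Fraction of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest selection rate across groups."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data: model predictions and a sensitive attribute for eight people.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(predictions, groups)
gap = demographic_parity_difference(predictions, groups)
print(rates, gap)
```

A gap of 0 would mean every group receives positive predictions at the same rate. A large gap is a signal to dig deeper, not a verdict on its own.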

Hopefully, this list gave you a good idea about data and AI ethics and fairness. There are definitely many more resources out there, and I admit I have been partial to Microsoft for their research and resources.

There will be more posts on ethics and fairness in the future as well, especially covering the two tools from Microsoft, Fairlearn and InterpretML.

Data Anonymization

With all the news about the Cambridge Analytica scandal, there are a lot of critical discussions going on about data privacy. With that, I've found a few resources that can help us as data scientists with data anonymization and privacy.

The first one is a video by Katharina Rasch, who goes over some tips on how we can anonymize data better.

Another resource comes from Microsoft. They have a course in data ethics and law. It's an interesting offering for a program about artificial intelligence, but an important one.
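As a small illustration of what this can look like in practice, here is a minimal sketch of two common techniques: replacing a direct identifier with a keyed hash, and generalizing an exact age into a range. It is not taken from either resource above. The key, the function names, and the record are all hypothetical, and keyed hashing on its own is pseudonymization rather than full anonymization, since the remaining fields can still be combined to re-identify someone.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be stored securely, not in code.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def generalize_age(age, width=10):
    """Turn an exact age into a coarser range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Made-up record, for illustration only.
record = {"email": "jane@example.com", "age": 34, "city": "Austin"}

safe_record = {
    "user_id": pseudonymize(record["email"]),
    "age_range": generalize_age(record["age"]),
    "city": record["city"],
}

print(safe_record)
```

The same email always maps to the same user_id, so records can still be joined, but the email itself no longer appears in the data.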