Posts

Week 32-33: Wrapping up

Hi all. This is my last blog post for this project. Over the last few weeks, we spent little time working on the project and instead focused on finishing up our classes. To be honest, we lost some motivation after finishing the paper, and we've been reluctant to start new tasks so close to the end of the year. In this post, I'd like to talk about some of the things I've learned in this project beyond what we found in our research. This post is really more a service to myself, to help me reflect on this past year and what I've learned. When I started this project, I didn't understand how research in computing worked. My exposure to research had been more biological/clinical, in which research involved setting a hypothesis about some protein, designing an experiment with positive and negative controls, and then running the experiment repeatedly until confident in the results. In computing research, my experience has been of two varieties: reading papers and proce...

Week 31: Datasets for fairness

This week, Diana and I looked for some datasets on which to evaluate our fairness criteria. This is no simple task. For one, the datasets need to be large enough to be split into training, calibration, and testing sets. Second, there needs to be some protected group attribute like sex, race, or age. Third, there needs to be some outcome ranking as well as some truth value. The truth is often the hardest attribute to find. In this post, I introduce some of the datasets we decided to use. I try to identify each of the necessary features that we need to properly test our fairness criteria.

COMPAS

The COMPAS dataset is an obvious choice because it is well established in the fairness literature. It has 18,000 observations and 1,000 attributes. Protected groups can be sex (81/19) or race (51/49). The outcome attribute can be the recidivism score, the score used in court that represents the risk that the accused will commit another crime. The truth attribute is a li...
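To make those requirements concrete, here is a minimal sketch of how a dataset like COMPAS might be prepared. It assumes a ProPublica-style CSV; the file path, column names, and the 60/20/20 split are illustrative choices, not necessarily what we will end up using.

```python
# Minimal sketch: prepare a fairness dataset with a protected attribute,
# an outcome ranking (risk score), and a ground-truth label.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("compas.csv")  # hypothetical path

protected = "race"          # protected group attribute (could also be "sex")
score = "decile_score"      # outcome ranking: the risk score used in court
truth = "two_year_recid"    # truth value: did the person actually reoffend?

# Keep only the columns we need and drop incomplete rows.
df = df[[protected, score, truth]].dropna()

# Split into training, calibration, and testing sets (60/20/20),
# stratifying on the protected attribute so each split has similar group sizes.
train, rest = train_test_split(df, test_size=0.4, stratify=df[protected], random_state=0)
calibration, test = train_test_split(rest, test_size=0.5, stratify=rest[protected], random_state=0)

print(len(train), len(calibration), len(test))
```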

Week 30: Returning to Fairness

With the LineUp paper submitted, it's now time to return to the fairness problem we were working on before. As a bit of a refresher for you and for me, I'll review what we've done with fairness so far and what remains to be done.

Fairness in Ranking

Up until now, the literature regarding fairness has focused on fairness in classification. As an illustrative example, consider fairness in hiring practices with respect to applicant sex. A job opening might receive 100 applicants evenly split into 50 males and 50 females. If males and females are equally capable of doing the job, then you might be concerned to see the hiring company mostly interviewing male applicants. To evaluate fairness, you would compare each applicant's capability (capable of performing duties, not capable) to their outcome (interviewed, not interviewed). In a completely fair scenario, the same proportion of capable applicants from each sex group would receive an interview. Small deviations from this...
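As a toy illustration of that classification check (the applicant data below is made up purely for illustration, not from any real hiring dataset), one could compute the interview rate among capable applicants in each sex group and look at the gap between the groups:

```python
# Rough sketch of the classification-fairness check described above:
# among *capable* applicants, compare the interview rate for each sex group.
import pandas as pd

applicants = pd.DataFrame({
    "sex":         ["M", "M", "M", "F", "F", "F", "F", "M"],
    "capable":     [1,   1,   0,   1,   1,   0,   1,   1],
    "interviewed": [1,   1,   0,   1,   0,   0,   0,   1],
})

# Restrict to capable applicants, then compute the interview rate per group.
capable = applicants[applicants["capable"] == 1]
rates = capable.groupby("sex")["interviewed"].mean()
print(rates)

# In a perfectly fair scenario these rates are equal; the gap between them
# is one simple measure of how far the process deviates from that ideal.
print("gap:", abs(rates["M"] - rates["F"]))
```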

Week 29: We submitted!

Yesterday we submitted the paper to IEEE VAST 2018! Caitlin was the rockstar; she kept working until the very end and got the paper finished and submitted. I admit I was pooped by Friday evening, so I wasn't much help the next day. However, I was able to make a good video demonstrating RanKit, so I feel good about that. The other undergraduates who kept working on this project into D term were also a huge help; they fixed bugs at the last minute and made it happen. Overall, I feel very happy with the way things turned out. I don't want to talk too much more about the paper, since it is under double-blind review right now. Next week, we will meet and talk about tasks for the next few weeks of the term. I'll write about that in my next post. Cheers!

Week 27-28: Writing

Hello all. For the past two weeks since returning from spring break, I have been working with the grad student to write the vis paper. It is due next Friday. A couple of the undergraduates (along with my CREU peer, Diana) have stayed on to polish the RanKit application for publication. Briefly, I'll share a couple of things I wrote in the past two weeks that have gone into the paper. First, I'll share some equations that describe the amount of information collected from the different build tools for each unit of user effort. Second, I'll give three use cases that describe why these tools might be used.

Equations

The ranking algorithm is based on pairwise comparisons. We will assume that the user's input is meaningful and that their ranking contains no cycles (e.g., saying Toy Story 1 > Toy Story 2, Toy Story 2 > Toy Story 3, and Toy Story 3 > Toy Story 1 would be a cycle). The collection of all possible rankings is a directed acyclic gra...
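To illustrate the no-cycles assumption, here is a small sketch (not code from RanKit; the movie labels and the Kahn's-algorithm check are just for illustration) that flags when a set of pairwise comparisons contains a cycle:

```python
# Each comparison (a, b) means the user ranked a above b; the pairs form a
# directed graph, and a consistent ranking exists only if that graph is acyclic.
from collections import defaultdict, deque

def has_cycle(pairs):
    """Kahn's algorithm: if a topological order can't cover every node,
    the comparisons contain a cycle."""
    graph = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for a, b in pairs:
        nodes.update((a, b))
        if b not in graph[a]:
            graph[a].add(b)
            indegree[b] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    seen = 0
    while queue:
        n = queue.popleft()
        seen += 1
        for m in graph[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    return seen != len(nodes)

# The cyclic example from the post: Toy Story 1 > 2 > 3 > 1.
print(has_cycle([("TS1", "TS2"), ("TS2", "TS3"), ("TS3", "TS1")]))  # True
print(has_cycle([("TS1", "TS2"), ("TS2", "TS3")]))                  # False
```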

Week 26: On Break

I am away from school this week for Spring Break; I will continue with progress next week.

Week 25: Plan for new interactive model

Last week, I wrote about our new plan to write a visualization paper based on the IQP team's ranking application. At conferences like VAST, there are several different types of papers one can submit, and our first challenge was deciding where to pigeonhole our application. After talking with Professor Harrison, a notable vis expert in our CS department, we determined that the novelty in our project is the user interface. As such, our paper would fall into the category of systems or design study. According to Process and Pitfalls in Writing Information Visualization Research Papers, a sort-of guidebook to the world of writing vis papers, systems papers focus on architecture choices and the design of abstractions. Design study papers, in contrast, demonstrate a new visual design to solve a problem, often using existing algorithms or techniques. If we consider ours to be a systems paper, we would probably focus on explaining the pipeline (from choosing a dataset to building a ranking from...