Nora Holder, 2024
Does This Look Good on Me?
Forcing a Machine Learning Program to Pick My Outfits,
Because I’m Too Indecisive.
Abstract: Considering the base concept of the "Paradox of Choice" (giving a person more options delays their response and worsens their overall satisfaction), I noticed that picking a good outfit is a recurring problem for both myself and my peers. Starting from dozens of "outfit A or B" texts and evolving into a machine learning model, this paper walks through the logistics of building a decision tree model, asks what exactly counts as a "good" outfit, and uses a subjective accuracy measure to judge whether a prediction is truly a good fit or whether the model needs adjustments.
A majority of the population has some routine. Your alarm goes off at some time between 5 and 7 AM. You might work out in the morning and shower after, or maybe you shower in the evening. There are tons of variables, but one thing we are all required to do is figure out an outfit. For tons of people this is nothing major: random shirt, random pants, random shoes. It is a simple process until you consider other factors: what's the weather? Are you going to work? What clothes are comfortable for the occasion, and are they ready? It's menial to some, but I've noticed how much of a damper it can be on my routine. I wake up at 6, do all my showering and such by 6:45, and then I have until 7:50 to pick an outfit, put on makeup, and leave. I have been a few minutes late at least once a week simply because I don't know what I want to wear, or what I'd like to wear, so what happens when I take that choice away from myself?
To reiterate the problem: I need an optimized program that I can hand a few attributes and that hands me back an outfit (or multiple outfits). For this I need to decide what the attributes are, how to value them, and how to account for a user being indifferent about a given attribute. It also calls for an easy, non-manual way to insert new outfits into a given dataset (e.g. a secondary program).
And the solution is just that: two programs. newfit.py takes new outfits and inserts them into two datasets (the purpose of which will be discussed later), and generatefit.py generates at minimum two outfits, one from each dataset, from a given set of attributes.
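The scripts themselves are not reproduced in this paper, so the following is a minimal sketch of what newfit.py could look like. The column names, the attribute vocabulary, and the rule used to expand an outfit into its "Indifferent" variants are my assumptions for illustration, not the project's actual implementation.

# newfit.py -- illustrative sketch only (assumed column order:
# Style, NoJacket, LightJacket, HeavyJacket, Color, Occasion, Comfort, Label)
import csv
from itertools import combinations

def add_outfit(attrs, label,
               clothes_path="clothes.csv", full_path="fulldataset.csv"):
    """Append one outfit to clothes.csv (exact values only) and a set of
    'Indifferent' variants to fulldataset.csv."""
    # Exact-value entry for the small dataset.
    with open(clothes_path, "a", newline="") as f:
        csv.writer(f).writerow(attrs + [label])

    # One plausible expansion rule: every subset of attributes replaced by
    # "Indifferent" (the paper does not specify how the ~13,000 variant rows
    # in the full dataset are actually produced).
    with open(full_path, "a", newline="") as f:
        writer = csv.writer(f)
        for k in range(len(attrs) + 1):
            for idxs in combinations(range(len(attrs)), k):
                variant = [("Indifferent" if i in idxs else v)
                           for i, v in enumerate(attrs)]
                writer.writerow(variant + [label])

if __name__ == "__main__":
    # Example outfit: the seven attribute values plus a short label.
    add_outfit(["Typical", "Standard", "Fall", "Winter",
                "Yellow", "Everyday", "Standard"],
               "Y_BDown_Bl_Slacks_Y_Docs")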
The reason this decision is so paralyzing is often attributed to executive dysfunction, but that alone does not explain why these decisions paralyze us; that's where the paradox of choice appears. In a 2009 study by Antti Oulasvirta, Janne P. Hukkinen, and Barry Schwartz, participants were tested on how they responded to extended search results versus a shorter list. Those given 6 choices were able to complete the task faster and with significantly more satisfaction than their peers given 24 items. [1] Most of the work around choice paralysis sits in economic studies, such as the study by Kurien et al., which observed that extended choices for a consumer lead to delays on purchases [2]; in our case it would be a delay in a decision about clothing, or any general decision. This brings us back to the feeling of guilt and dissatisfaction with a result of one's own choosing: our fear to "choose wrong" becomes a deterrent in most eyes, because the reward per added choice does not grow at a rate that outweighs the rate at which the lows of guilt and consequence grow, creating a net loss [3]. Since this is a niche (and personal) project, there was no previous work contributing direct knowledge on the specific concept of "pick outfits for me".
[Figure: outfits from New York Fashion Week 2019, via Li, "From One-Legged Pants to No Pants at All, These Are the Weirdest NYFW Trends," Teen Vogue (see Citations).]
So, with our project come some concerns. How do we measure "accuracy" in something subjective? For starters, someone's opinion on a "good outfit" is subjective. The example to the left is not meant to gawk at or mock fashion; rather, I personally find all of these outfits lovely. I'm sure others have higher praise or harsher criticism, and some may find these downright ridiculous. I am not a fashion major, nor do I plan to become one for the sake of this project.
There are a plethora of ways one could approach this model. The questions I'm asking are (1) what attributes could be assigned to these outfits in a dataset, and (2) in which cases we would count a program run as a "positive" return. After observing a handful of my personal outfits along with general fashion, I reduced the attributes down to 7 values: (1) the "style"[1] of the outfit, (2-4) the weather the outfit suits without a jacket, with a light jacket, and with a heavy jacket, (5) the predominant color, (6) the occasion/event, and (7) the comfort. Every insert into the dataset also includes a label, which is a brief description of the main outfit pieces: shirt, pants, shoes[2]. In cases where two values of an attribute fit a given outfit interchangeably (maybe an outfit works both everyday and at a party), two outfit insertions are needed, one for each. One of the biggest concerns with this project is that, while it is a classification task, it is very rare for any given item to have a double entry (excluding the cases mentioned previously).
The solution I found was to give two outputs (a very minimal set of choices): one from the original dataset, which contains only exact values, and one from the full dataset, which accounts for variants with "Indifferent" in several attributes, giving more flexibility for an outfit. Both datasets are read into the system, which creates decision tree models using Python's scikit-learn library, trains one on each dataset, and returns a prediction (decision) from each. Consider the following example: we input the requested attributes and get the results below, along with my personal decision on whether I would wear each on that given day.
Input:
Style | Weather (No Jacket) | Weather (Light Jacket) | Weather (Heavy Jacket) | Color | Occasion | Comfort
Typical | Standard | Fall | Winter | Yellow | Everyday | Standard

Outputs:
Dataset | Label | Would I Wear?
clothes.csv | "W_Shirt_Cr_Slacks_Y_Docs" | Maybe
fulldataset.csv | "Y_BDown_Bl_Slacks_Y_Docs" | Yes
(In the original document, portions of each output label were color-coded as having no relation, partial relation, or correct relation to the requested attributes.)
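For concreteness, here is a minimal sketch of the generatefit.py pipeline described above: one scikit-learn decision tree trained per dataset, each returning a prediction for the requested attributes. The column names, file layout, and the ordinal encoding of the categorical values are assumptions made for illustration rather than the project's actual code.

# generatefit.py -- illustrative sketch only (assumed column names and layout)
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["Style", "NoJacket", "LightJacket", "HeavyJacket",
            "Color", "Occasion", "Comfort"]

def predict_outfit(csv_path, request):
    """Train a decision tree on one dataset and return its predicted label
    for a single requested set of attributes."""
    data = pd.read_csv(csv_path)

    # Decision trees need numeric features, so the categorical attribute
    # values are ordinal-encoded; request values never seen in the dataset
    # fall back to -1.
    encoder = OrdinalEncoder(handle_unknown="use_encoded_value",
                             unknown_value=-1)
    X = encoder.fit_transform(data[FEATURES])
    tree = DecisionTreeClassifier().fit(X, data["Label"])

    query = encoder.transform(pd.DataFrame([request], columns=FEATURES))
    return tree.predict(query)[0]

if __name__ == "__main__":
    request = ["Typical", "Standard", "Fall", "Winter",
               "Yellow", "Everyday", "Standard"]
    # One prediction from each dataset, as in the example above.
    for path in ("clothes.csv", "fulldataset.csv"):
        print(path, "->", predict_outfit(path, request))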
Measuring human desire/goal is flawed at best, so for evaluation purposes the two datasets were evaluated on the same responses with a point system. There were two scoring variants: Variant 1 treats a possible outfit the same as a definite one (1 point for a yes or a maybe, 0 for a no), while Variant 2 values definite answers significantly more (5 for a yes, 2 for a maybe, 0 for a no). For simplicity on a small-scale data model, only 30 tests were run (a month's worth of outfits); to emphasize, a perfect score under Variant 1 would be 30 points, compared to 150 points under Variant 2. The tests included one that was simply seven Indifferents, to see what outfit would be returned with zero true requests. It could be assumed this gives fullDataset an upper hand, as clothes.csv has no Indifferent categories. There were also prompts for which no entry fit the request "exactly", from which neither dataset would gain a clear advantage. It is also important to note that we did not take advantage of the option to give multiple choices for a given prompt; that option was not evaluated due to concerns about time and a reasonable evaluation model.
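As a small sketch of this scoring, assuming each day's reaction is recorded as a plain "yes"/"maybe"/"no" string (this helper is illustrative and not one of the project's scripts):

# Scoring the 30 daily responses under both variants (illustrative only).
def score_responses(responses):
    """responses: list of 'yes' / 'maybe' / 'no' strings, one per test day."""
    # Variant 1: a possible outfit counts the same as a definite one.
    variant1 = sum(1 for r in responses if r in ("yes", "maybe"))
    # Variant 2: definite answers are worth significantly more.
    points = {"yes": 5, "maybe": 2, "no": 0}
    variant2 = sum(points[r] for r in responses)
    return variant1, variant2  # maxima of 30 and 150 over a 30-day run

# 22 definite yeses and 8 nos reproduce clothes.csv's row in the table below.
print(score_responses(["yes"] * 22 + ["no"] * 8))  # (22, 110)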
Surprisingly, the results did not show a significant difference between the two datasets' accuracy, with only a 3% margin. It is important to note that while clothes.csv returned fewer wearable outfits (a yes or a maybe), all of its wearable outfits were definite yes values.

Dataset | Variant 1 | Variant 2 | "Accuracy"[3]
clothes.csv | 22/30 | 110/150 | 73.33%
fullDataset.csv | 25/30 | 83/150 | 76.33%
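As a worked check of the weighting in footnote [3], clothes.csv's "accuracy" is 0.75 × (22/30) + 0.25 × (110/150) ≈ 73.33%, and fullDataset.csv's is 0.75 × (25/30) + 0.25 × (83/150) ≈ 76.33%, matching the table above.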
Another crucial note is that none of these results would be considered "perfect", as their accuracy does not fall near 95%. This could be for several reasons. For starters, the generated test set did not guarantee that an entry exactly matching the requested attributes existed in either dataset, which could skew the results negatively; it also mirrors a form of user error, where a user has not inputted an outfit matching the given request and is thus returned an inaccurate response.
4.2 Viability, Logistics, & Retests With a Refined Experiment Set
On a rerun of our program using exclusively examples found within the set, the data displayed a significantly better outcome.

Dataset | Variant 1 | Variant 2 | "Accuracy"
clothes.csv | 28/30 | 140/150 | 93.33%
fullDataset.csv | 27/30 | 123/150 | 88%

The exact-value dataset (clothes.csv) returned only two inaccurate values, both of which were entirely inaccurate. The full dataset returned 27 viable items, but 4 of them were "maybe" values rather than all "yes" responses. The margin is still minimal, which justifies outputting predictions from models trained on both datasets on every program execution.
Considering the small scope of a personal project, this would be considered a working success. The base clothes.csv file only contains 200 values (and the fullDataset is around 13,000), and as time passes and the dataset gains diversity, future accuracy tests would likely report over 95%. There is also the question of whether this model is less optimal than another. I would advise against a neural network, due to the sheer size of the fullDataset, but if you have the time it could be considered. I could see switching to a random forest model to test for greater accuracy, or creating a model with 3 labels (although that would require rebuilding the dataset from the ground up).
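If the random forest route were taken, the change would be small; a hypothetical sketch, reusing the same assumed encoding pipeline as the decision tree sketch above:

# Hypothetical random forest variant of the same pipeline (not part of the
# project); only the estimator changes relative to the decision tree sketch.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OrdinalEncoder

FEATURES = ["Style", "NoJacket", "LightJacket", "HeavyJacket",
            "Color", "Occasion", "Comfort"]

def predict_outfit_rf(csv_path, request):
    data = pd.read_csv(csv_path)
    encoder = OrdinalEncoder(handle_unknown="use_encoded_value",
                             unknown_value=-1)
    X = encoder.fit_transform(data[FEATURES])
    # An ensemble of trees in place of the single decision tree.
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    forest.fit(X, data["Label"])
    query = encoder.transform(pd.DataFrame([request], columns=FEATURES))
    return forest.predict(query)[0]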
Overall, the program satisfies the objective, even if on occasion it may require multiple runs and/or attribute selections. Even then, a rather daunting (but basic) choice is reduced from tens of combinations to fewer than 5. The paradox of choice will not, in most cases, be entirely extinguished, but it is greatly mitigated; and if that speeds up your routine, that's all that matters.
Citations
Antti Oulasvirta, Janne P. Hukkinen, and Barry Schwartz. 2009. When more is less: the paradox of choice in search engine use. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '09). Association for Computing Machinery, New York, NY, USA, 516–523. https://doi.org/10.1145/1571941.1572030
Rony Kurien, Anil Rao Paila, Asha Nagendra. Application of Paralysis Analysis Syndrome in Customer Decision Making. Procedia Economics and Finance, Volume 11, 2014, Pages 323-334, ISSN 2212-5671. https://doi.org/10.1016/S2212-5671(14)00200-7 (https://www.sciencedirect.com/science/article/pii/S2212567114002007)
Schwartz, B. (2004). The tyranny of choice. Scientific American, 290(4), 70–75. http://www.jstor.org/stable/26047678
Li, Michelle. “From One-Legged Pants to No Pants at All, These Are the Weirdest NYFW Trends.” Teen Vogue, Teen Vogue, 12 Sept. 2019, www.teenvogue.com/story/new-york-fashion-week-2019-weirdest-trends.
“1.10. Decision Trees.” Scikit, scikit-learn.org/1.5/modules/tree.html. Accessed 14 Dec. 2024.
[1] Note: style, like all of the given attributes, is subjective; every person's dataset will be personalized.
[2] Various extras like jewelry, makeup, etc. are significantly easier to choose once given an outfit, so they are not included for the sake of simplicity.
[3] Accuracy was based 75% on the first variant of testing, basing score out of 30, and 25% on the second variant, basing a score out of 150.