MeFi Post Recommendation Engine
January 24, 2018 9:50 AM
Just finished building a content recommendation engine for MeFi using natural language processing and non-negative matrix factorization! It produces a list of post recommendations based on a user's history of posts, comments, and favorites. It can also make recommendations based on a piece of text: for example, you could paste in a particular post and it will return a list of other posts with similar characteristics. I hope you enjoy playing around with it! Please let me know what you think. Here's more info in case you're interested (: https://github.com/tomasbielskis/metafilterpostrecommender
Role: Developer
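For readers curious how a text-based recommender along these lines can work, here is a minimal sketch. It is not the code from the linked repository. It assumes raw word counts instead of a real NLP pipeline, a hand-rolled NMF with multiplicative updates, and cosine similarity in topic space for ranking. The toy corpus, the post names, and the helpers `nmf`, `fold_in`, and `recommend` are all made up for illustration.

```python
import math
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor non-negative v (docs x terms) into w (docs x k) and h (k x terms)
    with multiplicative updates that reduce the squared reconstruction error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h

def fold_in(row, h, iters=200, eps=1e-9):
    """Topic weights for one new document, keeping the topic matrix h fixed."""
    ht = transpose(h)
    hht = matmul(h, ht)
    num = matmul([row], ht)[0]
    q = [1.0] * len(h)
    for _ in range(iters):
        den = matmul([q], hht)[0]
        q = [q[j] * num[j] / (den[j] + eps) for j in range(len(h))]
    return q

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def squared_error(v, w, h):
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))

# Hypothetical toy corpus standing in for MeFi posts.
posts = {
    "cat-video": "cat helicopter cat video",
    "kitten-pics": "kitten cat pictures cat",
    "nmf-howto": "matrix factorization topic model",
    "topic-models": "topic model matrix text",
}
names = list(posts)
vocab = sorted({word for text in posts.values() for word in text.split()})
v = [[text.split().count(term) for term in vocab] for text in posts.values()]
w, h = nmf(v, k=2)  # w holds one topic vector per post

def recommend(text, top_n=2):
    """Rank the known posts by topic similarity to a pasted piece of text."""
    row = [text.lower().split().count(term) for term in vocab]
    q = fold_in(row, h)
    ranked = sorted(range(len(names)), key=lambda i: cosine(q, w[i]), reverse=True)
    return [names[i] for i in ranked[:top_n]]

print(recommend("a video of a cat in a helicopter"))
```

A production version would more likely use a library implementation with TF-IDF weighting and many more topics; the sketch only shows the shape of the idea: factor the post-by-term matrix, project the query into the same topic space, and rank by similarity.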
Thank you! That's a very insightful observation: people's comments don't necessarily reflect what they like to read. Arguably, that doesn't completely rule out comments as a source of signal, but I definitely agree with you. There are ways to fiddle with this; a trivial one would be giving a lower weight to coefficients derived from the comments, but for this prototype I haven't done much optimization like that yet.
Ideally, what would improve the quality of the recommendations is behavioral data on which posts users actually read or click on, but that's not public (if MeFi even tracks it), so I was trying to make the most of the data that was available.
There are definitely weird and surprising things that pop out of the recommendation set right now, like the test post you linked to. Short posts are hard to deal with since there's very little information that can be derived from them, but I didn't want to remove them altogether since a lot of posts here are short yet relevant.
I kind of wanted to keep the recommendations a bit raw and unmanicured because I think that adds something to the discovery process; it's very tempting and easy to overcustomize and end up with an outcome that's nothing but stale and boring.
One rather bizarre outcome with my recommender is this post about the helicopter cat that it thinks everyone should read: (the link in the post is broken but here's one that works). I mean, who on MeFi wouldn't want that recommended to them?
posted by tomasbielskis at 11:25 AM on January 25, 2018
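The down-weighting idea mentioned in the comment above could look roughly like this. It is only a sketch: it assumes each post, comment, and favorite has already been turned into a topic vector, and the function name `weighted_profile`, the source names, and the example weights are all made up for illustration.

```python
def weighted_profile(sources, weights):
    """Combine per-source topic vectors into one user profile.

    sources: dict mapping a source name to a list of topic vectors
    weights: dict mapping the same source names to a non-negative weight
    Returns the weighted average of all vectors.
    """
    k = len(next(vec for vecs in sources.values() for vec in vecs))
    total = [0.0] * k
    weight_sum = 0.0
    for name, vecs in sources.items():
        for vec in vecs:
            for j, x in enumerate(vec):
                total[j] += weights[name] * x
            weight_sum += weights[name]
    return [t / weight_sum for t in total] if weight_sum else total

# Hypothetical user: posts and favorites lean toward topic 0,
# while comments lean toward topic 1.
sources = {
    "posts": [[1.0, 0.0]],
    "favorites": [[1.0, 0.0]],
    "comments": [[0.0, 1.0], [0.0, 1.0]],
}
equal = weighted_profile(sources, {"posts": 1.0, "favorites": 1.0, "comments": 1.0})
damped = weighted_profile(sources, {"posts": 1.0, "favorites": 1.0, "comments": 0.5})
print(equal, damped)
```

With equal weights the two comments pull the profile to an even split between the topics; halving the comment weight shifts it back toward what the user posted and favorited.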
Thought it was broken but it's just case-sensitive, which makes sense. Neat.
posted by Wretch729 at 7:20 PM on February 1, 2018
Is there a secret reason why this post seems to rank high for any username I put in?
posted by Wretch729 at 7:24 PM on February 1, 2018
Haha, yeah, I pointed that out above as well... Apparently, that's the post with the most universal appeal in this community!
posted by tomasbielskis at 10:00 AM on February 6, 2018
One downside of using text written by the user instead of favorites (which you explained were too sparse, unfortunately) is that people sometimes comment on things that anger them while quietly enjoying things they like. So, these are more like recommendations for posts that are like the kind of things you write on MetaFilter. I have some recs in there that I could see myself having talked about in the past, but absolutely do not think I'd feel satisfied with reading.
Something neat you could do with these post vectors you calculated is to take all the posts that a user faved, then cluster them. It'd be pretty neat to see what my clusters are and what's in them.
posted by ignignokt at 1:24 PM on January 24, 2018
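The clustering suggestion above could be sketched like this. It assumes the favorited posts already have topic vectors; the post names and vectors are invented, and plain k-means seeded with the first k points stands in for whatever clustering method one would actually choose.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical topic vectors for posts one user has favorited.
faved = {
    "post-a": [0.9, 0.1],
    "post-b": [0.1, 0.9],
    "post-c": [0.8, 0.2],
    "post-d": [0.2, 0.8],
}
labels, centroids = kmeans(list(faved.values()), k=2)

clusters = {}
for name, label in zip(faved, labels):
    clusters.setdefault(label, []).append(name)
print(clusters)
```

Each cluster's centroid is itself a topic vector, so the highest-weighted topics in a centroid could serve as a rough label for what that group of favorites is about.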