I recently reached a milestone on Stack Overflow. I have answered 300 questions (and most are answered correctly 😋) on the [accessibility] tag.
It is part of a personal challenge I set myself nearly two years ago.
Since September 2019 I have tried to give the best answers I can on the [accessibility] and [pagespeed-insights] tags on Stack Overflow, linking with my two passions in web development, load speed and inclusivity.
I find answering questions to be the best way to learn (and reinforce any imposter syndrome or conversly, make sure I don't suffer from the Dunning Kruger effect! hehe.
Anyway, I know I have been very active answering on those tags but the narcissist in me wanted to know how I compared to everyone else.
Luckily Stack Overflow has a great feature that few people seem to use / know about Query Stack Overflow (https://data.stackexchange.com/)
It allows you to analyse the Stack Overflow database and grab some interesting statistics / info.
So here are a couple of queries I put together so I could analyse my progress:
Number of comments, questions and answers ranked by number of answers
This query took me ages to put together, referencing several other queries other people had writeen to piece it together. Put it like this, it really highlighted my poor SQL skills! 🤣
It allows you to grab all of the questions, answers and comments on a given tag, ranked by number of answers given initially.
The query
DECLARE @tagName varchar(255)
SET @tagname = rtrim(##TagName:string##)
DECLARE @start varchar(255)
SET @start = rtrim(##Start:string##);
DECLARE @end varchar(255)
SET @end = rtrim(##End:string##);
WITH questsByTags AS (
SELECT DISTINCT
q.Id
, q.OwnerUserId
FROM Posts q
INNER JOIN PostTags pt ON q.Id = pt.PostId
INNER JOIN Tags t ON t.Id = pt.TagId
WHERE q.PostTypeId = 1 -- questions
AND q.CreationDate > @Start
AND q.CreationDate < @End
AND (
t.TagName = @TagName
)
),
answersByTags AS (
SELECT
a.Id
, a.OwnerUserId
FROM Posts a
INNER JOIN questsByTags qbt ON qbt.Id = a.ParentId
),
commntsByTags AS (
SELECT
c.Id
, c.UserId AS [OwnerUserId]
FROM Comments c
INNER JOIN (
SELECT Id FROM questsByTags
UNION ALL SELECT Id FROM answersByTags
) AS allPosts
ON allPosts.Id = c.PostId
),
allUsers AS (
SELECT OwnerUserId FROM questsByTags
UNION SELECT OwnerUserId FROM answersByTags
UNION SELECT OwnerUserId FROM commntsByTags
)
SELECT au.OwnerUserId AS [User Link],
u.DisplayName
, (SELECT Count (qbt.Id) FROM questsByTags qbt WHERE qbt.OwnerUserId = au.OwnerUserId) AS [Num Qsts]
, (SELECT Count (abt.Id) FROM answersByTags abt WHERE abt.OwnerUserId = au.OwnerUserId) AS [Num Ans]
, (SELECT Count (cbt.Id) FROM commntsByTags cbt WHERE cbt.OwnerUserId = au.OwnerUserId) AS [Num Cmmnts]
FROM allUsers au
LEFT JOIN users u
ON u.id = au.OwnerUserId
WHERE au.OwnerUserId IS NOT NULL
ORDER BY [Num Ans] DESC, [Num Cmmnts] DESC, [Num Qsts] DESC
The permalink so you can try it yourself
Inputs
- Tag - The Stack Overflow tag you want to see rankings for
- Start Date - When to query from
- End Date - When to query to, can be set to a future date if you want it to be "until today".
Outputs
- User Link - a link to their profile (when exporting to CSV outputs the userID
- User Display Name - the user name, purely so it can be exported to CSV
- Number of questions - how many questions that person asked with that tag
- Number of Answers - number of answers that person gave with that tag on the question
- Number of Comments - the total number of comments across questions and answers given by that user.
Results
For the tag [accessibility] running from 2019-12-31 to 2020-21-31 (first 20 rows)
User Link | DisplayName | Num Qsts | Num Ans | Num Cmmnts |
---|---|---|---|---|
2702894 | Graham Ritchie | 1 | 188 | 512 |
1971216 | QuentinC | 0 | 32 | 26 |
4540141 | Adam | 0 | 28 | 22 |
4274933 | Josh | 0 | 22 | 32 |
469491 | brennanyoung | 0 | 17 | 22 |
3825084 | XLE_22 | 0 | 10 | 15 |
1226227 | michaelpuckett | 0 | 5 | 0 |
3147711 | Alex Walczak | 0 | 5 | 0 |
4597840 | Phil Weaver | 0 | 4 | 2 |
10838693 | SAURABH | 0 | 4 | 1 |
4873295 | ShellZero | 2 | 3 | 8 |
14129711 | Unbywyd | 0 | 3 | 3 |
12252274 | Poli97 | 7 | 2 | 12 |
5587356 | Super Jade | 0 | 2 | 10 |
6002174 | Tsundoku | 0 | 2 | 6 |
8318731 | Lionel Rowe | 0 | 2 | 5 |
7406840 | Stefany Newman | 1 | 2 | 4 |
4733161 | cloned | 0 | 2 | 4 |
5186515 | nonoandy | 2 | 2 | 3 |
608042 | Andy | 1 | 2 | 3 |
For the tag [pagespeed-insights] running from 2019-12-31 to 2020-21-31 (first 20 rows)
User Link | DisplayName | Num Qsts | Num Ans | Num Cmmnts |
---|---|---|---|---|
2702894 | Graham Ritchie | 1 | 58 | 152 |
14063860 | Sham | 0 | 9 | 1 |
1710628 | Ramesh Elaiyavalli | 0 | 5 | 0 |
3151817 | addyo | 4 | 4 | 1 |
5585371 | Ferran Buireu | 0 | 2 | 5 |
12242328 | Grégoire | 0 | 2 | 0 |
8698671 | Ozik Jarwo | 0 | 2 | 0 |
11303070 | oreoorbitz | 0 | 2 | 0 |
6331353 | Sam | 6 | 1 | 22 |
10990737 | Ifaruki | 0 | 1 | 7 |
14273874 | Thomas | 1 | 1 | 2 |
2441103 | David Lorenzo López | 1 | 1 | 2 |
218980 | szymond | 1 | 1 | 2 |
1742382 | stylesuxx | 0 | 1 | 2 |
6820056 | Abhishek Jain | 0 | 1 | 1 |
681548 | keul | 0 | 1 | 1 |
4425964 | hiew1 | 0 | 1 | 1 |
731631 | Erik | 0 | 1 | 1 |
1177814 | Dipen Shah | 0 | 1 | 1 |
1237494 | Samar Panda | 0 | 1 | 1 |
As you can see the narcissist in me is happy with those results (I am Graham Ritchie, so now you know my real name 😋).
Not only top of the board but by a decent margin, I consider that challenge complete!
Every single answer given on a particular tag by a user
The other thing I wanted was a word count for a given tag.
Now because I wanted to make this a useful general query there were a few other things I wanted.
I wanted to know if an answer given was the accepted answer (to work out accepted answer rates).
I also wanted to be able to work out the total number of points given from a particular tag between particular dates (luckily the accepted answer combined with the "score" column is sufficient for this).
I wasn't clever enough to work out how to do a word count within the query (or if it is even possible, which I doubt) so I just wanted to grab all of my answers so I could process them elsewhere.
The query
DECLARE @UserID varchar(255)
SET @userid = rtrim(##UserID:string##)
DECLARE @tagName varchar(255)
SET @tagname = rtrim(##TagName:string##)
DECLARE @start varchar(255)
SET @start = rtrim(##Start:string##);
DECLARE @end varchar(255)
SET @end = rtrim(##End:string##);
select p.*, q.Title AS QnTitle, q.Body AS QnBody, q.AcceptedAnswerId AS AcceptedID, q.tags AS tags,
CASE
WHEN q.AcceptedAnswerId = p.Id THEN '1'
ELSE '0'
END AS isAccepted
from posts AS p
INNER JOIN posts AS q
ON p.parentID = q.id
INNER JOIN PostTags pt
ON q.Id = pt.PostId
INNER JOIN Tags t
ON t.Id = pt.TagId
where p.owneruserid=@UserID
AND t.TagName = @TagName
AND q.CreationDate > @Start
AND q.CreationDate < @End
AND p.PostTypeId = 2;
The permalink so you can try it yourself
Inputs
- UserID - the user ID you want to query (can be found by going to a persons profile on SO)
- Tag - The Stack Overflow tag you want to see all answers for
- Start Date - When to query from
- End Date - When to query to, can be set to a future date if you want it to be "until today".
Outputs
- Score - the score is the cumulative up / down votes
- Tags - all tags on the original question
- isAccepted - whether this answer was accepted by the OP (for some strange reason you may have to run the query twice for this to show up!)
- QnTitle - the original title
- QnBody - the original question
- Body - the body of the answer
- A few others as I was lazy and just grabbed the entire entry for the answer
Results
I am not going to show the results here due to the number of columns. Instead run the query yourself for your own user ID.
Conclusion
The two queries above let you analyse your Stack Overflow contributions in some quite interesting ways.
For example I found out I had written over 250,000 words in total last year on Stack Overflow! That is quite scary really (it is one of the reasons why I am starting to write on Dev.to instead so I get some benefit from writing so much!)!
By running the query month by month you can build up an idea of your contributions to a tag over time (something I am still working on).
Living in a "niche" lets you rank well
Obviously I hang around in the quiet tags so it is easy to be top of the board, but for you you might find you are close to the top of the board on some more active tags and decide to put some extra effort in to reach the top!
Or, on the flip side, the queries above might be useful for trying to find people for advice.
Obviously it isn't a guarantee of getting the best person but being high up the rankings does at least give you an idea of whether they are reasonably knowledgeable on a subject.
Do you have any fun queries to share?
I like analysing data, I am not very good at it though!
So does anyone have any interesting queries on data.stackexchange.com they have created or stumbled across? If you have then let me know in the comments!
Maybe you have a query that works out which tags are most popular? Or which user writes the most comments on the site? Whatever it is be sure to share it!
Final question - are you near the top for any tags?
Is there a tag that you perform particularly well on? If so share a link to the query and your Stack Overflow ID / Display name so we can see where you shine!
Don't be afraid of showing off, as you can see this post is half useful, half a way to brag a little bit! 🤣🤣
For those of you who are awaiting my next mad web experiment, the next one is good fun...."I built a fully functional SPA website using a single DOM element, no CSS and a tonne of JS!"...it took me ages but is nearly finished...coming soon™(It is live)