Social Groups and Class Divides

In India, political and economic debates around inequality are framed differently. In politics, the debate is often in the context of social groups which have historically been worse off than the rest of society – dalits and tribals for example, and from the early 90s, Other Backward Classes (OBCs). In economics, the debate is framed more in terms of income groups – those above and below different levels of income. How can we bring these two different categories (social groups on the one hand, and class on the other) together?

The chart below shows how households of a social group, say dalits, are distributed across ten different levels of consumption expenditure (a proxy for income), from poorest to richest (left to right), in a state as of 2010. Within each state and social group, I’ve further split the population into rural and urban (use the drop-down box at the bottom of the chart to switch). Hover your mouse over any one of the groups in a state to display a popup with more detail. So, for instance we note that around 18% of tribal households in rural India as a whole are concentrated in the lowest consumption group. The other charts in the same row give the comparable numbers for dalit (13%), OBC (10%) and ‘other’ households (5%).

(Use the drop-down box to switch between rural and urban areas. Hover your mouse over a cell to see the data in more detail and for a comparison of rural and urban areas. Cells without data in them indicate that the group is too small, at 5% or below of the overall population, to be included.)

There are many interesting bits of information here. For rural India as a whole, dalit and tribal households are concentrated in lower consumption groups (as we would expect). OBC households are much more evenly divided across all consumption groups. In general, inequalities are sharper in urban areas. And note the shape of the graph of the ‘other’ category – in almost all cases it slopes upward, showing that a disproportionate chunk of that population is concentrated in the upper consumption groups.

But patterns across individual states are more interesting. In some of the richer states such as Punjab and Haryana, inequalities are sharply pronounced in the case of dalit households – more so than in other states. In Delhi, and in urban areas of Maharashtra and Karnataka, we can see a pronounced concentration of dalits in the middle groups. Ditto for rural Gujarat. In urban Tamil Nadu, a larger chunk of the ‘other’ group is concentrated in the ‘richest’ consumption bracket, as compared with the national average. Note the difference in the shape of the graphs of the tribal households in rural areas of Madhya Pradesh and Chhattisgarh.

But so what? It’s quite tempting to draw conclusions about politics from this. So we could argue, for instance, that the sharper inequalities in the dalit population of rural Punjab and Haryana, as compared to say, UP, are because of the much more visible role of the dalit movement in politics in the latter. This is interesting to do, but beyond a point it’s pure speculation and it’s not true in all cases.


If a cell or graphic is blank, it means that particular group is too small to be included (5% or below of the relevant population).

How are these graphs constructed? We start by taking the entire set of rural (or urban) households in a state (dalits, tribals etc) and putting them in one common bucket. We then split this population up into ten equally sized groups, according to consumption expenditure, and within each group, count the number of dalit households, tribal households, OBCs etc. Each bar in each graph then represents the share of a social group’s households in that state that falls in that consumption bracket.
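The bucketing described above can be sketched in a few lines of Python. This is only an illustration, assuming household-level records of the form (social group, consumption expenditure); the function name `decile_shares` and the toy records are made up, the survey's sampling weights are ignored, and the published report tables may already provide these shares in tabulated form.

```python
from collections import defaultdict

def decile_shares(households):
    """Hypothetical helper: households is a list of (social_group, consumption)
    tuples for ONE state and ONE sector (rural or urban).
    Returns {group: [percent of that group's households in each of 10 brackets]}.
    """
    # Put everyone in one common bucket and rank by consumption (poorest first).
    ranked = sorted(households, key=lambda h: h[1])
    n = len(ranked)

    # Count households of each group inside each of ten equally sized brackets.
    counts = defaultdict(lambda: [0] * 10)
    for rank, (group, _consumption) in enumerate(ranked):
        decile = rank * 10 // n  # 0 = poorest tenth, 9 = richest tenth
        counts[group][decile] += 1

    # Convert counts to the share of each group's own households per bracket.
    shares = {}
    for group, per_decile in counts.items():
        total = sum(per_decile)
        shares[group] = [100.0 * c / total for c in per_decile]
    return shares

# Toy data (made up): ten 'dalit' households below ten 'other' households.
toy = [("dalit", c) for c in range(1, 11)] + [("other", c) for c in range(11, 21)]
result = decile_shares(toy)
```

Each list sums to 100 for its group, so the bars for a group show how its households are spread across the brackets rather than how big the group is.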

It’s important to note that these inequalities are relative to a state. So an OBC or dalit in the lowest income group in Punjab, for instance, may still be better off in absolute terms than a dalit in the same income group in Rajasthan.

One caveat about the ‘other’ category. It also includes a large chunk of Muslim households, which have historically been worse off than the rest of the population. So we must be careful about concluding that all the population in the ‘other’ category is more privileged than other social groups.

The data are from National Sample Survey data on consumption across social groups (NSSO report 544 for 2009-10, Tables 2R and 2U).

Graphs made using D3.

Exploring Spikes in Onion Prices

This post charts onion prices at the wholesale level from 2006 up to last month. As you can see, it’s a bit of a wild ride.

(Note : made slight edits to following para)

I’ve also added data about total market arrivals (the volume of onions reaching wholesale markets for further distribution) across the country. And while there are dozens of wholesale markets from which onions are distributed to retailers, there are roughly eleven markets which accounted for about a third of all market arrivals since 2006. Data for arrivals in these markets are broken out separately (in terms of percentage of total arrivals).

(Prices are in rupees per quintal, total arrivals are in tons, and arrivals in the largest markets are computed as a percentage of arrivals across the country. Use the drop down boxes to overlay different pieces of data against each other and the smaller graphic to pan or zoom into a particular part of the data set. See notes for a description of the data)

What do we learn from these charts? Here are a few very rough impressions.

There have been about nine price spikes since 2006. In each case, prices return to more or less the level they were at before the spike began, with no visible longer-term trend (at least to the naked eye – a proper analysis would involve some serious statistical heavy lifting to verify trends etc). The latest spike is the sharpest one. Even as I write though, prices have begun to drop. It’s not visible in the chart, which ends around mid-September.

While there’s a widespread impression that spikes in onion prices happen towards the end of the year, it’s actually not as clear cut. Here are the very rough dates at which prices peak in each cycle – Feb 2007, July 2007, August 2008, December 2008, November 2009, December 2010, August 2011, January 2013, September 2013.

If you overlay market arrivals with prices, you notice an interesting pattern – price spikes coincide with an increase in market arrivals, and not a decrease as you might expect. This is not true for all cases, but it is true often enough.

In almost every case, a price spike also coincides with an increase in the share of the top markets in total arrivals. And a price dip coincides with a decrease in the share of the top markets.

Put these observations together and what do we have? What follows is one possible explanation of the facts and not the only one, and I’m no expert.

The beginning of a price spike coincides with a period when arrivals flood into the larger markets such as in Bengaluru or in the Pune-Nasik belt. But crucially, smaller markets, which act as final distribution centres for retail consumers, don’t see the benefit as yet. Hence the phenomenon of a rise in arrivals, but also a rise in the share of arrivals into larger markets. This continues for a couple of months, during which mandi prices soar.

There comes a time when larger mandis begin disgorging onions towards the smaller ones. This coincides with the beginning of a sharp dip in prices, as well as a dip in the share of the largest markets in total arrivals.

The Competition Commission had some decided views on this whole process. For a related story see this.

The latest price spike though, seems different from most of the previous episodes. It coincides with a very severe, sharp drop in both total arrivals in all markets, and in arrivals in the biggest markets. In that sense it seems similar to the previous highest spike we saw in late 2010-early 2011. Then too, total arrivals and arrivals in the largest markets as a percentage of the overall saw a sharp drop. (Edit : But arrivals in the largest markets as a share of the overall rose at that time)

The one big missing link in all of this is of course, retail prices and how they behave as compared to the wholesale data. I’ll try and blog more about different aspects of onion markets in the coming weeks. Hopefully.

Further Edit : Showing this data on a log scale is also worth doing. Will put that up in a few days.


The price I’ve used here is the average of modal prices across all mandis for each day, weighted by arrivals in the respective mandi (a quantity-weighted average). I then computed a 30-day centred moving average of these prices to iron out the ‘noise’. Arrivals data are also 30-day moving averages.
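For anyone who wants to reproduce the smoothing, here is a rough Python sketch of the two steps: a quantity-weighted average price per day, followed by a centred moving average. The record layout (day, mandi, modal price, arrivals), the function names, and the handling of the series edges (the window simply shrinks there) are my assumptions for illustration, not a description of the exact processing behind the charts.

```python
from collections import defaultdict

def daily_weighted_price(records):
    """records: iterable of (day, mandi, modal_price, arrivals).
    Returns {day: arrivals-weighted average of modal prices across mandis}.
    """
    value = defaultdict(float)   # sum of price * arrivals per day
    volume = defaultdict(float)  # sum of arrivals per day
    for day, _mandi, price, arrivals in records:
        value[day] += price * arrivals
        volume[day] += arrivals
    return {d: value[d] / volume[d] for d in sorted(value) if volume[d] > 0}

def centred_moving_average(values, window=30):
    """Average each point with its neighbours in a window centred on it.
    Assumption: near the start and end the window is truncated, not padded.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i - half + window)
        chunk = values[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out

# Toy records (made up): two mandis on day 1, one mandi on day 2.
toy = [(1, "A", 100.0, 10.0), (1, "B", 200.0, 30.0), (2, "A", 300.0, 5.0)]
prices = daily_weighted_price(toy)
smoothed = centred_moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```

With an even window such as 30 there is no exact middle day, so this sketch takes the 15 days before a date, the date itself, and the 14 days after it.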

The ‘large markets’ considered here are : Bengaluru, Azadpur (Delhi), Bhavnagar, Pune, Lasalgaon, Varanasi, Mumbai, Yeola, Vashi, Pimpalgaon, Mahuva.
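The share of these large markets in total arrivals could be computed along the following lines. This is a sketch under assumptions: the record layout (day, mandi, arrivals in tons) and the function name are hypothetical, and the market names as spelled here may not match the spellings in the raw data.

```python
from collections import defaultdict

# The eleven markets listed above; spellings in the raw data may differ.
LARGE_MARKETS = {
    "Bengaluru", "Azadpur", "Bhavnagar", "Pune", "Lasalgaon", "Varanasi",
    "Mumbai", "Yeola", "Vashi", "Pimpalgaon", "Mahuva",
}

def large_market_share(records, large_markets=LARGE_MARKETS):
    """records: iterable of (day, mandi, arrivals_in_tons).
    Returns {day: arrivals in large markets as a percentage of all arrivals}.
    """
    total = defaultdict(float)
    large = defaultdict(float)
    for day, mandi, arrivals in records:
        total[day] += arrivals
        if mandi in large_markets:
            large[day] += arrivals
    return {d: 100.0 * large[d] / total[d] for d in sorted(total) if total[d] > 0}

# Toy records (made up); "Smalltown" stands in for any market not on the list.
toy = [(1, "Pune", 30.0), (1, "Smalltown", 70.0), (2, "Yeola", 50.0), (2, "Pune", 50.0)]
share = large_market_share(toy)
```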

The data was downloaded from here. You can download the raw data, in all its noisy, un-averaged glory from here. (Warning : this is a 51 MB csv file when unzipped. You don’t want to be trying to open this in Excel on a slow machine)

It’s important not to confuse the term ‘arrivals’ with ‘production’ or ‘output’. There’s double counting in the arrivals data. For instance, a unit of onions could be counted once when it arrives in Bengaluru Mandi, and then counted again, when it is sent from there to Azadpur in Delhi. It’s best to treat ‘arrivals’ as some sort of proxy for availability of onions for distribution, whether to the wholesale trade or into the retail channel.

Charts made using D3.

The Toilet Map Done in a (Slightly) Different Way

Thanks very much for all the positive comments on the toilet map in the previous post. But one thing bothered me about it and I thought I would fix that.

The issue comes up whenever you use maps to represent social or economic data and it has to do with the rather obvious fact that maps represent parts of the surface area of the earth.

Take the case of Ladakh and Mumbai. In the previous map, Ladakh (represented by the big area right at the top) has a very small population but relatively large surface area. Mumbai is exactly the opposite. So Mumbai, with around 20 million residents, shows up as a small speck on the map, but Ladakh, with a population nowhere close to that of Mumbai, shows up as the biggest part of the map. Visually, Ladakh dominates, because it covers a huge area, though in terms of the size of its population (which is more relevant when we are talking about toilets or cellphones or whatever), it really shouldn’t. The tehsil of Jaisalmer in the far west of the country is another example – large surface area but relatively sparsely populated.

Geographers have of course known about this problem for a long time, and have tried various ways to fix it. The map below is one way to do so. Each tehsil is now just a dot of the same size on the map, irrespective of area. Now, no tehsil is more obvious than any other, purely by virtue of the fact that it covers a bigger area. I also changed the colour scheme just for the heck of it. Everything else is unchanged.

(Apologies but this post may take a while to load on a slow connection, and/or if you are using a tablet or phone. Click or tap on the map to switch between 2001 and 2011. Hover your mouse over a tehsil to see its details. The greyed-out regions are those for which data couldn’t be compiled, or those which are not relevant e.g. PoK)

There are still other problems – the eye seems naturally drawn to Bihar, West Bengal and northern Andhra purely because the tehsils there are much smaller. So the dots cluster much more closely together giving rise to a continuous band of colour in those regions. But on the whole, I think this is a better way to do it.

Edited to add: Even this map is a second best solution. What I should do is size each dot to reflect population density in that area. So for instance, Mumbai would show up as a larger circle than Jaisalmer or Ladakh. But the technical challenge of creating such ‘cartograms’ is beyond me at this stage.
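One simple version of the sized-dot idea is to make each circle's area proportional to the value being shown, which means the radius grows with the square root of that value. The Python sketch below only illustrates that scaling; the function name and the numbers are made up, and this is a proportional-symbol approach rather than a true cartogram.

```python
import math

def scaled_radius(value, max_value, max_radius):
    """Radius such that circle AREA is proportional to value.
    value could be a tehsil's population or its density (hypothetical inputs).
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    return max_radius * math.sqrt(max(value, 0.0) / max_value)

# Made-up numbers: a place with a quarter of the largest value
# gets half the radius, and so a quarter of the circle area.
r = scaled_radius(25.0, 100.0, 10.0)
```

Scaling the radius directly with the value would exaggerate large places, because the eye reads the area of a circle and not its radius.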

Thanks to a reader who pointed out that the ‘tooltips’ – which show details of each area when you hover the mouse over them – don’t work in Firefox browsers for some reason. I have no clue why. I’m working on it, but in the meanwhile, you are probably better off using Chrome. Sorry about that.

Notes :

Please see the previous post for the full set of notes. Map created using the D3 javascript library and colours are from colorbrewer.
