Adding total counts as horizontal lines to histograms in facet_grid()
<p>Data:</p>
<p>I have a data frame of four variables and about 300k rows: a unique account ID, a start date in yyyy-mm-dd format, a start year, and the total number of months to date that the customer has held an active account. A snippet of the data is below (the row numbers are not consecutive because this is a subset; if more data is necessary, let me know):</p>
<pre><code>&gt; head(ten.by.id)
    acct.id start_date strt.yr max_ten
1       155 1998-11-01    1998     175
19      902 2001-09-01    2001     143
39      995 2001-09-01    2001     143
59     1014 2000-10-01    2000     153
78     1017 2000-04-01    2000     160
100    1137 2000-11-01    2000     153
</code></pre>
<p>Problem (why I want to render a faceted plot):</p>
<p>A histogram of the entire dataset across all years renders the following:</p>
<p><img src="https://i.stack.imgur.com/l8sQd.png" alt="Frequency histogram of Tenure by # of Customers"></p>
<p>There is clearly a mixture of distributions here, but its cause is unknown. As a first step I thought I'd check visually for time-domain effects. By using facets, I can show a series of frequency histograms, one per start year, with the KDE plot for each year overlaid.</p>
<p>If the multiple distributions were a product of something that occurred over time, I could spot-check relevant shape changes (e.g. unimodal to multimodal). I used the code below to generate this plot:</p>
<pre><code>maxten_time &lt;- ggplot(ten.by.id, aes(max_ten)) +
  geom_histogram(colour="grey19", fill="orange", binwidth=2, stat="bin") +
  scale_y_continuous(breaks=seq(0,12000,by=100)) +
  scale_x_continuous(breaks=seq(0,180,by=45)) +
  labs(title="Serial Distribution of Max Length of Tenure for all Customers by Start Date",
       x="Max Tenure(months)", y="# of Customers", colour="blue") +
  # one panel per start year
  facet_grid(. ~ strt.yr) +
  geom_density(fill=NA, colour="orange", cex=1) +
  # plot-level mapping: both layers use counts on the y axis
  aes(y = ..count..)
</code></pre>
<p>Which renders the following:</p>
<p><img src="https://i.stack.imgur.com/agQcw.png" alt="enter image description here"></p>
<p>Questions about extending the faceted plot:</p>
<ul>
<li><p>I want to add a horizontal line (or some other single marker) to each facet indicating the total number of customer starts for that year. Can this be done in a faceted plot?</p></li>
<li><p>I would like to add an additional axis that spans the facets to mark the number of months across all years (1 to 175). Am I asking too much of ggplot here (i.e. since each facet is its own plot, would aligning the month markers across all facets even be possible)? I haven't seen any relevant examples of something quite like this.</p></li>
</ul>
<p>The objective is simply to combine the horizontal lines in each facet and the axis across facets into one plot. Any direction would be helpful.</p>
<p>Phillip</p>
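For the first question, one possible approach is sketched below. It is a minimal sketch under assumptions, not a confirmed solution: the data frame is a small synthetic stand-in that reuses the column names `strt.yr` and `max_ten` from the question, and `yr.totals` and `n_starts` are hypothetical names introduced here. The idea is to build a summary table with one row per start year and pass it to `geom_hline()` as that layer's own data, so that `facet_grid()` draws each year's line only in the matching panel.

```r
library(ggplot2)

# Synthetic stand-in for ten.by.id (illustrative values, not the real data)
set.seed(1)
ten.by.id <- data.frame(
  strt.yr = sample(1998:2001, 500, replace = TRUE),
  max_ten = sample(1:175, 500, replace = TRUE)
)

# One row per start year: the total number of customer starts in that year.
# yr.totals and n_starts are hypothetical names used only for this sketch.
yr.totals <- aggregate(max_ten ~ strt.yr, data = ten.by.id, FUN = length)
names(yr.totals)[2] <- "n_starts"

p <- ggplot(ten.by.id, aes(max_ten)) +
  geom_histogram(colour = "grey19", fill = "orange", binwidth = 2) +
  # The layer has its own data; because yr.totals also contains strt.yr,
  # each panel receives only the row for its own year.
  geom_hline(data = yr.totals, aes(yintercept = n_starts),
             colour = "blue", linetype = "dashed") +
  facet_grid(. ~ strt.yr)

print(p)
```

A yearly total is usually far larger than any single bin count, so a line at that height stretches the y axis and compresses the bars. If that is a problem, the same `yr.totals` table could instead feed a `geom_text()` layer that prints the total as a label in each panel.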