<pre><code>import pandas as pd

def promotion(ls):
    # Count the steps where the title level increased
    return (ls.diff() &gt; 0).sum()

def growth(ls):
    # Last title level minus the first one, selected by position
    return ls.iloc[-1] - ls.iloc[0]

jobData = pd.DataFrame(
    {'candidate_id': [1, 2, 2, 2],
     'TitleLevel': [2, 1, 2, 1]})

grouped = jobData.groupby("candidate_id")

titlePromotion = grouped["TitleLevel"].agg(promotion)
print(titlePromotion)
# candidate_id
# 1    0
# 2    1
# dtype: int64

titleGrowth = grouped["TitleLevel"].agg(growth)
print(titleGrowth)
# candidate_id
# 1    0
# 2    0
# dtype: int64
</code></pre>
<hr>
<p>Some tips:</p>
<p>If you define the generic function</p>
<pre><code>def foo(ls):
    print(type(ls))
</code></pre>
<p>and call</p>
<pre><code>jobData.groupby("candidate_id")["TitleLevel"].apply(foo)
</code></pre>
<p>Python will print</p>
<pre><code>&lt;class 'pandas.core.series.Series'&gt;
</code></pre>
<p>This is a low-brow but effective way to discover that calling <code>jobData.groupby(...)[...].apply(foo)</code> passes a <code>Series</code> to <code>foo</code>.</p>
<hr>
<p>The <code>apply</code> method calls <code>foo</code> once for every group. The per-group results are glued together into a Series or a DataFrame. It is possible to use <code>apply</code> when <code>foo</code> returns a scalar such as a number or a string, but in such cases I think <code>agg</code> is preferable. A typical use case for <code>apply</code> is when you want to, say, square every value in a group and thus need to return a new group of the same shape.
</p>
<p>The <code>transform</code> method is also useful in this situation, when you want to <em>transform</em> every value in the group and thus need to return something of the same shape. Its result can differ from that of <code>apply</code>, though, because a different object may be passed to <code>foo</code>: with a grouped DataFrame, <code>transform</code> passes each column of a group to <code>foo</code>, while <code>apply</code> passes the entire group. The easiest way to understand this is to experiment with a simple DataFrame and the generic <code>foo</code>.</p>
<p>The <code>agg</code> method calls <code>foo</code> once for every group, but unlike <code>apply</code> it should return a single value per group. The group is <em>aggregated</em> into a value. A typical use case for <code>agg</code> is when you want to count the number of items in the group.</p>
<hr>
<p>You can debug and understand what went wrong with your original code by using a variant of the generic <code>foo</code> function that prints the Series it receives, followed by a separator line:</p>
<pre><code>In [30]: grouped['TitleLevel'].apply(foo)
0    2
Name: 1, dtype: int64
--------------------------------------------------------------------------------
1    1
2    2
3    1
Name: 2, dtype: int64
--------------------------------------------------------------------------------
Out[30]:
candidate_id
1    None
2    None
dtype: object
</code></pre>
<p>This shows you the Series that are being passed to <code>foo</code>. Notice that in the second Series, the index values are 1, 2, and 3. So <code>ls[0]</code> raises a <code>KeyError</code>, since there is no label with value <code>0</code> in the second Series.</p>
<p>What you really want is the first item in the Series. That is what <code>iloc</code> is for.</p>
<p>So to summarize, use <code>ls[label]</code> to select the row of a Series with index value of <code>label</code>.
Use <code>ls.iloc[n]</code> to select the row at integer position <code>n</code> of the Series.</p>
<p>Thus, to fix your code with the least amount of change, you could use</p>
<pre><code>def promotion(ls):
    pro = 0
    if len(ls) &gt; 1:
        for j in range(1, len(ls)):
            if ls.iloc[j] &gt; ls.iloc[j-1]:
                pro += 1
    return pro

def growth(ls):
    head = ls.iloc[0]
    tail = ls.iloc[len(ls)-1]
    gro = tail - head
    return gro
</code></pre>
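<hr>
<p>As a rough sketch of the points above, the snippet below reuses the same toy data to show the shape of what <code>agg</code> and <code>transform</code> return, and to contrast label-based with position-based selection on the second group. The names <code>job_data</code>, <code>n_items</code>, <code>squared</code> and <code>group2</code> are made up for this illustration, and the exact behaviour may vary slightly between pandas versions.</p>

```python
import pandas as pd

# Same toy data as in the answer (variable names here are illustrative only)
job_data = pd.DataFrame(
    {"candidate_id": [1, 2, 2, 2], "TitleLevel": [2, 1, 2, 1]})
grouped = job_data.groupby("candidate_id")["TitleLevel"]

# agg: each group collapses to a single value, indexed by the group key
n_items = grouped.agg("count")

# transform: the result has the same length and index as the original column
squared = grouped.transform(lambda s: s ** 2)

# The second group keeps its original index labels (1, 2, 3),
# so position 0 exists but label 0 does not.
group2 = grouped.get_group(2)
first_by_position = group2.iloc[0]
try:
    group2[0]  # label-based lookup on an integer index
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

print(n_items)
print(squared)
print(first_by_position, label_lookup_failed)
```

<p>Here <code>n_items</code> has one row per candidate, <code>squared</code> has one row per original row, and the label lookup fails for the same reason <code>ls[0]</code> failed in the original code.</p>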