Merge pull request #7 from jhudsl/updates
reproducing tables
carriewright11 authored May 29, 2024
2 parents 0ddc5b6 + d4b7d65 commit 14fbb92
Showing 1 changed file with 40 additions and 5 deletions.
45 changes: 40 additions & 5 deletions index.Rmd
@@ -61,16 +61,16 @@ Yes, indeed there are...
NA and zero values likely mean the nonprofit did not need to submit to the IRS.
It is impossible to know, however, whether a zero is actually a true zero. NA values could mean something else.

- Thus, we will recode asset amount based on a threshold of greater than or equal to 50,000 as high asset and less than 50000 (including zero) as not high asset.
+ Thus, we will recode asset amount based on a threshold of greater than or equal to 500,000 as high asset and less than 500,000 (including zero) as not high asset.
Note we keep our NA values with this recoding.

```{r}
df_simplified<-df_simplified %>%
# modify Asset amount variable to be numeric
- mutate(ASSET_AMT =as.numeric(ASSET_AMT)) %>%
+ mutate(ASSET_AMT = as.numeric(ASSET_AMT)) %>%
#create a variable about high asset amount (threshold being $500,000)
- mutate(ASSET_High = case_when(ASSET_AMT >= 500000 ~ TRUE,
+ mutate(ASSET_High = case_when(ASSET_AMT >= 500000 ~ TRUE,
ASSET_AMT < 500000 ~ FALSE))
```
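As a minimal sketch of why the NA values survive this recoding: `case_when()` returns `NA` for any row where no condition evaluates to `TRUE`, and both comparisons evaluate to `NA` when `ASSET_AMT` is missing. The `toy` data frame and its amounts below are made up for illustration and are not the real data.

```r
library(dplyr)

# Hypothetical asset amounts stored as text, including a zero and a missing value
toy <- tibble(ASSET_AMT = c("750000", "500000", "499999", "0", NA))

toy <- toy %>%
  # convert the text amounts to numeric; the missing value stays NA
  mutate(ASSET_AMT = as.numeric(ASSET_AMT)) %>%
  # same recoding as above: no condition matches an NA amount, so the result is NA
  mutate(ASSET_High = case_when(ASSET_AMT >= 500000 ~ TRUE,
                                ASSET_AMT < 500000 ~ FALSE))

toy$ASSET_High
```

In this sketch the zero is classified as `FALSE` (not high asset), while the missing amount remains `NA`.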

@@ -399,6 +399,8 @@ High_asset_data <- High_asset_data %>%
group_by(NTEE_text) %>%
mutate(Percent_ntee_cat = round(n/sum(n)*100))
High_asset_data
```

Visuals...of the above data:
@@ -420,13 +422,26 @@ High_asset_data %>%

**this includes all 4,082 organizations**

- ### Count plots
+ ## Count plots/Tables

### Different kinds of orgs

```{r}
library(forcats)
- df_simplified %>% group_by(NTEE_text) %>%summarize(count = n())
library(janitor)
df_simplified %>% group_by(NTEE_text) %>%summarize(count = n()) %>%
mutate(NTEE_text = str_replace(string = NTEE_text, pattern = "NA", replacement = "Unclassified")) %>%
mutate(Percentage = round(count/sum(count)*100, digits = 2)) %>%
arrange(NTEE_text) %>%
adorn_totals("row")
Total_NTEE <-df_simplified %>% group_by(NTEE_text) %>%summarize(count = n()) %>%
mutate(NTEE_text = str_replace(string = NTEE_text, pattern = "NA", replacement = "Unclassified")) %>%
arrange(NTEE_text)
```
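One detail of the recoding above that may be worth keeping in mind: `str_replace()` treats `pattern` as a regular expression, so `"NA"` matches anywhere inside a label, and a true missing value (`NA`) is returned unchanged rather than replaced. The sketch below uses made-up labels (including the hypothetical `"NASA Alumni"`) to show the difference from an anchored pattern. With the category labels used in this analysis, the two should behave the same.

```r
library(stringr)

# Hypothetical labels: the literal string "NA", a true missing value,
# and a label that merely contains the letters "NA"
labels <- c("Arts", "NA", NA, "NASA Alumni")

# Unanchored pattern: also rewrites the "NA" inside "NASA Alumni"
unanchored <- str_replace(string = labels, pattern = "NA", replacement = "Unclassified")

# Anchored pattern: only rewrites labels that are exactly "NA"
anchored <- str_replace(string = labels, pattern = "^NA$", replacement = "Unclassified")

unanchored
anchored
```

If the category column held true missing values rather than the string `"NA"`, a function such as `tidyr::replace_na()` would be the usual alternative.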


```{r}
df_simplified %>%
group_by(NTEE_text, Neighborhood) %>%
summarize(count = n()) %>%
@@ -469,8 +484,28 @@ plot2
```


**This includes all 4,082 organizations.** No organizations were removed based on asset amount; this is just to get a sense of what organizations are in Baltimore.


### High Asset Orgs

```{r}
High_counts <- df_simplified %>%
mutate(NTEE_text = as_factor(NTEE_text),
NTEE_text = forcats::fct_relevel(NTEE_text, "International Affairs", "Environment/Animals", "Arts", "Religious", "Health","Education", "Societal Benefit", "Human Services", "NA" )) %>%
group_by(NTEE_text, ASSET_High_text) %>%
summarize(count = n()) %>% filter(ASSET_High_text == "High Asset") %>%
mutate(NTEE_text = str_replace(string = NTEE_text, pattern = "NA", replacement = "Unclassified"))
full_join(Total_NTEE, High_counts, by = "NTEE_text") %>%
mutate("Percentage_of_each_code" = round(count.y/count.x *100, digits = 2)) %>%
arrange(NTEE_text)
```
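A minimal sketch of how the join above produces `count.x` and `count.y`: both tables have a column named `count`, so `full_join()` appends its default suffixes, `.x` for the first table (all organizations) and `.y` for the second (high asset organizations). The tables `total_toy` and `high_toy` and their numbers are hypothetical stand-ins for `Total_NTEE` and `High_counts`.

```r
library(dplyr)

# Hypothetical counts standing in for Total_NTEE (all orgs per category)
total_toy <- tibble(NTEE_text = c("Arts", "Education", "Health"),
                    count = c(40, 100, 60))

# Hypothetical counts standing in for High_counts (high asset orgs per category);
# "Health" is deliberately absent
high_toy <- tibble(NTEE_text = c("Arts", "Education"),
                   count = c(10, 40))

joined <- full_join(total_toy, high_toy, by = "NTEE_text") %>%
  # count.x comes from total_toy, count.y from high_toy
  mutate(Percentage_of_each_code = round(count.y / count.x * 100, digits = 2)) %>%
  arrange(NTEE_text)

joined
```

A category with no high asset organizations has no row in the second table, so its `count.y` and percentage come out as `NA` rather than 0.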



## Distribution of percent AA

Now let's look at whether 50% African American makes sense. What do the neighborhoods look like?
