Analyzing a factious Dataset :School Enrollment
School enrollment is one of the most pertinent issue facing educational institutions whether Public ,Private or Charter schools. From Pre K to University Campuses those in charge of these institution have to make sure that enough students are not only enrolled but stay enrolled to the end of their academic education. Attached with this is the issue of enrolling those with disabilities. This is a short assignment that I was tasked with doing and only a snippet of the data was utilized for this assignment.
/* Inputting the data into SAS */
Data Enrollment;
Input
SchoolID StudentID EntryDate$ ExitDate SchoolYear$ Gradelevel$;
Datalines;
1 71753 08232021 06232022 SY202122 K
1 96831 08262020 01142021 SY202021 3
3 96831 01042021 06182021 SY202021 3
4 29486 09232021 06232022 SY202222 12
2 49281 10042020 06182021 SY202021 7
1 59356 10042020 06182021 SY202021 2
4 40519 08272020 04192021 SY202021 9
Format Date mmddyy10.;
RUN;
Data School;
Input SchoolID SchoolName$ SchoolLevel$ LEA$;
Datalines;
1 Armstrong Elementary Friendship
2 Pierce Middle Friendship
3 Chamberlain Elementary Friendship
4 Collegiate High Friendhsip
Format Length SchoolName20 SchoolLevel30 LEA20.;
run;
Data Student;
Input
StudentID Name$ CurrentGradeLevel CurrentSchool$ StudentWithDisabilities$;
Datalines;
71753 Potter 7 Pierce True
96831 Hamilton 4 Chamberlain False
29486 Taylor 12 Collegiate True
Foramt Length Name60 CurrentGradeLevel CurrentSchool20.;
run;
PROC CONTENTS DATA = STUDENT School Enrollment;
RUN;
Analyzing the Data Set
The following questions needed to be answered using either Sql, SAS (data step) or Proc Sql. I decided to use Proc Sql and SAS Data Step
QUESTION 1
Write a query to find the number of unique students who were enrolled at Friendship Public Charter
School in School year 2020–2021/
Proc sql;
Select distinct studentid from Enrollment
where schoolyear = ‘SY202021’;
quit;
QUESTION 2
Write a query to find those students with disabilities and also list their Id numbers/
/ Students with Disabilities/
Proc sql;
Select studentID, Name from student
where studentwithdisabilities = ‘True’;
quit;
QUESTION 3
This data was analyzed and sorted using SAS and Proc Sql. The Data step was used to input the data into a SAS Academics software. If the data set was large, I could have either saved the data in a file and in filed the data and read it into SAS. The data could be Comma Separated file, Space Separated file or Tab Separated file and pulled into SAS.
In addition, I had the option of adding the data into an excel file save it and use the import procedure to read the data into SAS. To analyze data one can, use either the proc freq, proc means and proc univariate to draw conclusion.
If one needs to plot graphs one can use proc sql proc sgplot to draw graphs and scatterplots. Drawing conclusions depends on the purpose of analyses. Proc here means procedure.
Proc freq data = enrollment;
run;
proc univariate data = enrollment;
run;
In carrying out analyses over the three tables the school can merge the tables using either the Merge Data Step or using Proc Sql Joins such as the left join, right join or full join. To do a merge using the data step the data needs to be sorted out before being merged using a common denominator in this case one can use StudentId or schoolId which is common in the Charts. Since the school Chart lacks a StudentID one can create a new variable called StudentId allowing all three Charts to be joined together in a merge.
proc sort data = student;
by studentid;
run;
proc sort data = enrollment;
by studentid;
run;
Using the Proc sql to join the tables together one can either do a left right or a full join. But unlike using the Data Step using the proc sql does not require the data to be sorted out before doing a join of the table. Once the tables are merged or joined one can do an analysis using either the proc means proc univariate and or proc freq procedure to draw additional conclusions and analysis.
Data NewMerged;
Merge Student Enrollment;
by studentId;
run;
******************************************************************
(The Entire Code Run in SAS on Demand for Academics)
*******************************************************************
Data Enrollment;
Input
SchoolID StudentID EntryDate$ ExitDate SchoolYear$ Gradelevel$;
Datalines;
1 71753 08232021 06232022 SY202122 K
1 96831 08262020 01142021 SY202021 3
3 96831 01042021 06182021 SY202021 3
4 29486 09232021 06232022 SY202222 12
2 49281 10042020 06182021 SY202021 7
1 59356 10042020 06182021 SY202021 2
4 40519 08272020 04192021 SY202021 9
Format Date mmddyy10.;
RUN;
Data School;
Input SchoolID SchoolName$ SchoolLevel$ LEA$;
Datalines;
1 Armstrong Elementary Friendship
2 Pierce Middle Friendship
3 Chamberlain Elementary Friendship
4 Collegiate High Friendhsip
Format Length SchoolName20 SchoolLevel30 LEA20.;
run;
Data Student;
Input
StudentID Name$ CurrentGradeLevel CurrentSchool$ StudentWithDisabilities$;
Datalines;
71753 Potter 7 Pierce True
96831 Hamilton 4 Chamberlain False
29486 Taylor 12 Collegiate True
Foramt Length Name60 CurrentGradeLevel CurrentSchool20.;
run;
PROC CONTENTS DATA = STUDENT School Enrollment;
RUN;
**************************************************************************
Question 1 Write a query to find the number of unique students who were enrolled at Friendship Public Charter
School in School year 2020–2021/
**************************************************************************
Proc sql;
Select distinct studentid from Enrollment
where schoolyear = ‘SY202021’;
quit;
***************************************************************************
Question 2 Write a query to find those students with disabilities and also list there Id numbers/
/ Students with Disabilities/
Proc sql;
Select studentID ,Name from student
where studentwithdisabilities = ‘True’;
quit;
Proc freq data = enrollment;
run;
proc univariate data = enrollment;
run;
Proc sgplot data = enrollment;
histogram schoolID;
run;
proc sort data = student;
by studentid;
run;
proc sort data = enrollment;
by studentid;
run;
Data NewMerged;
Merge Student Enrollment;
by studentId;
run;
****************************************************************************