Tribhuvan University

Institute of Science and Technology

2079

Bachelor Level / Second Year / Third Semester / Science

B.Sc in Computer Science and Information Technology (STA215)

(Statistics II)

Full Marks: 60

Pass Marks: 24

Time: 3 Hours

Candidates are required to give their answers in their own words as far as practicable.

The figures in the margin indicate full marks.

Section A

Long Answer Questions

Attempt any TWO questions.
[2*10=20]
1.
What are the required conditions for the error variable in multiple regression analysis? The Internal Revenue Service (IRS) is trying to estimate the monthly amount of unpaid taxes discovered by its auditing division. The IRS estimated this figure on the basis of field audit labor hours and the number of hours its computers were used. The table given below presents these data for the last ten months. i. Develop the estimating equation best describing these data. ii. Interpret the values of the regression coefficients. iii. Estimate the actual unpaid taxes discovered if field audit labor hours are 4200 and computer hours are 1600.

$\begin{array}{|c|c|c|c|} \hline \text{Month} & X_1\text{: Field Audit Labor Hours (in 100s)} & X_2\text{: Computer Hours (in 100s)} & Y\text{: Actual Unpaid Taxes Discovered (millions of dollars)} \\ \hline \text{Jan} & 45 & 16 & 29 \\ \text{Feb} & 42 & 14 & 24 \\ \text{Mar} & 44 & 15 & 27 \\ \text{April} & 45 & 13 & 25 \\ \text{May} & 43 & 13 & 26 \\ \text{June} & 46 & 14 & 28 \\ \text{Jul} & 44 & 16 & 30 \\ \text{Aug} & 45 & 16 & 28 \\ \text{Sep} & 44 & 15 & 28 \\ \text{Oct} & 43 & 15 & 27 \\ \hline \end{array}$

Given: $\sum X_1 Y = 12005, \sum X_2 Y = 4013, \sum X_1 X_2 = 6485, \sum Y^2 = 7428, \sum X_1^2 = 19461, \sum X_2^2 = 2173$
[10]
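The fit asked for in Question 1 can be sketched as follows. This is a minimal illustration, not a model answer: it solves the least-squares normal equations in deviation form using only the tabulated data, and the variable names (`s11`, `b0`, `y_hat`, and so on) are chosen here for illustration.

```python
# Data from the table: X1 and X2 in hundreds of hours, Y in millions of dollars.
x1 = [45, 42, 44, 45, 43, 46, 44, 45, 44, 43]
x2 = [16, 14, 15, 13, 13, 14, 16, 16, 15, 15]
y = [29, 24, 27, 25, 26, 28, 30, 28, 28, 27]
n = len(y)


def centered_sum(u, v):
    """Sum of cross-products about the means: sum(u*v) - sum(u)*sum(v)/n."""
    return sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / n


s11 = centered_sum(x1, x1)
s22 = centered_sum(x2, x2)
s12 = centered_sum(x1, x2)
s1y = centered_sum(x1, y)
s2y = centered_sum(x2, y)

# Solve the two normal equations for the slopes, then recover the intercept.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n

# 4200 labor hours and 1600 computer hours are 42 and 16 in units of 100.
y_hat = b0 + b1 * 42 + b2 * 16

print(f"Y = {b0:.3f} + {b1:.3f} X1 + {b2:.3f} X2")
print(f"Estimated unpaid taxes: {y_hat:.2f} million dollars")
```

Under these data the slopes come out near 0.56 and 1.10, so each additional 100 labor hours or 100 computer hours is associated with roughly 0.56 or 1.10 million dollars more in discovered taxes, holding the other variable fixed.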
2.
What do you understand by 'Design of an Experiment'? Physicians depend on laboratory test results when managing medical problems such as diabetes or epilepsy. In a uniformity test of glucose tolerance, three different laboratories were each sent $n_t=5$ identical blood samples from a person who had drunk 50 mg of glucose dissolved in water. The laboratory results are listed below. Do the data indicate a difference in the average readings for the three laboratories? Use $\alpha = 0.05$.

$\begin{array}{|c|c|c|} \hline \text{Lab 1} & \text{Lab 2} & \text{Lab 3} \\ \hline 12.1 & 9.3 & 10.0 \\ 11.7 & 11.1 & 10.5 \\ 10.9 & 10.7 & 10.1 \\ 10.2 & 10.9 & 11.0 \\ 10.6 & 9.0 & 10.4 \\ \hline \end{array}$
[10]
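A one-way ANOVA for the data in Question 2 could be sketched as below. This is an illustrative outline only: it assumes a completely randomized design with the three laboratories as treatments, and the critical value 3.89 is the standard table value of F at the 0.05 level with (2, 12) degrees of freedom, hard-coded rather than computed.

```python
labs = {
    "Lab 1": [12.1, 11.7, 10.9, 10.2, 10.6],
    "Lab 2": [9.3, 11.1, 10.7, 10.9, 9.0],
    "Lab 3": [10.0, 10.5, 10.1, 11.0, 10.4],
}

k = len(labs)                                  # number of treatments
all_values = [v for g in labs.values() for v in g]
n_total = len(all_values)                      # total observations
grand_mean = sum(all_values) / n_total

# Between-laboratory and within-laboratory sums of squares.
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in labs.values())
ssw = sum((v - sum(g) / len(g)) ** 2 for g in labs.values() for v in g)

msb = ssb / (k - 1)
msw = ssw / (n_total - k)
f_stat = msb / msw

F_CRIT = 3.89  # table value F(0.05; 2, 12), assumed from standard F tables

print(f"SSB = {ssb:.3f}, SSW = {ssw:.3f}, F = {f_stat:.3f}")
print("Reject H0" if f_stat > F_CRIT else "Fail to reject H0")
```

With these readings the computed F stays below the tabulated value, so the sketch does not find evidence of a difference among the laboratory means at the 5% level.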
3.
Define Type I and Type II errors in hypothesis testing. A psychologist wishes to verify that a certain drug increases the reaction time to a given stimulus. The following reaction times (in tenths of a second) were recorded before and after injection of the drug for each of four subjects. Test at the 5% level of significance to determine whether the drug significantly increases the reaction time.

$\begin{array}{|c|c|c|c|c|} \hline \text{Reaction Time} & \text{Subject 1} & \text{Subject 2} & \text{Subject 3} & \text{Subject 4} \\ \hline \text{Before} & 7 & 2 & 12 & 12 \\ \text{After} & 13 & 3 & 18 & 13 \\ \hline \end{array}$
[10]
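The paired comparison in Question 3 might be worked along these lines. The sketch assumes a one-tailed paired t-test on the differences (after minus before); the critical value 2.353 is the standard table value of t at the 0.05 level with 3 degrees of freedom, hard-coded here.

```python
from math import sqrt
from statistics import mean, stdev

before = [7, 2, 12, 12]
after = [13, 3, 18, 13]

# Differences per subject; a positive value means a longer reaction time.
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = mean(d)
s_d = stdev(d)                 # sample standard deviation (divisor n - 1)
t_stat = d_bar / (s_d / sqrt(n))

T_CRIT = 2.353  # one-tailed t(0.05; 3), assumed from standard t tables

print(f"mean difference = {d_bar}, t = {t_stat:.3f}")
print("Reject H0" if t_stat > T_CRIT else "Fail to reject H0")
```

Here the statistic slightly exceeds the tabulated value, which would support the claim that the drug increases reaction time at the 5% level under this one-tailed setup.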
Section B

Short Answer Questions

Attempt any EIGHT questions.
[8*5=40]
4.
The following ANOVA summary table was obtained from a multiple regression model with two independent variables. i. Determine the mean sum of squares due to regression, the mean sum of squares due to error, and the F-value. ii. Test the significance of the overall model at the 5% level of significance. iii. Compute the coefficient of determination and interpret its value. iv. Find the standard error of estimate.

$\begin{array}{|c|c|c|c|c|} \hline \text{Source of variation} & \text{Sum of squares} & \text{Degrees of freedom} & \text{Mean sum of squares} & \text{F-value} \\ \hline \text{Regression} & 12.62 & 2 & ? & ? \\ \text{Error} & 0.78 & 12 & ? & \\ \text{Total} & 13.40 & 14 & & \\ \hline \end{array}$
[5]
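The quantities requested in Question 4 follow directly from the table entries. A minimal sketch, with the critical value 3.89 taken as the standard table value of F at the 0.05 level with (2, 12) degrees of freedom:

```python
from math import sqrt

ssr, sse, sst = 12.62, 0.78, 13.40   # sums of squares from the table
df_reg, df_err = 2, 12               # degrees of freedom from the table

msr = ssr / df_reg                   # mean sum of squares due to regression
mse = sse / df_err                   # mean sum of squares due to error
f_value = msr / mse

r_squared = ssr / sst                # coefficient of determination
std_error = sqrt(mse)                # standard error of estimate

F_CRIT = 3.89  # table value F(0.05; 2, 12), assumed from standard F tables

print(f"MSR = {msr:.3f}, MSE = {mse:.3f}, F = {f_value:.2f}")
print(f"R^2 = {r_squared:.4f}, standard error = {std_error:.4f}")
print("Model significant" if f_value > F_CRIT else "Model not significant")
```

An R² of about 0.94 would mean that roughly 94% of the variation in the dependent variable is explained by the two independent variables.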
5.
What do you mean by a non-parametric test? Write down the advantages of non-parametric tests over parametric tests. [5]
6.
Bank of Nepal recorded the sex of the first 30 customers who appeared last Monday in the following sequence: M M F M M F M F M F M F M F M F M F M F M F M F M F M F M F. At the 0.005 level of significance, test the randomness of this sequence. [5]
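A runs test for Question 6 could be sketched as follows, using the large-sample normal approximation (an assumption; exact tables could be used instead). The sequence is written compactly as the first six letters followed by twelve "MF" pairs, which reproduces the 30 recorded letters. The two-tailed critical value 2.807 for the 0.005 level is a standard normal table value, hard-coded here.

```python
from math import sqrt

# The 30 recorded letters: M M F M M F, then M F repeated twelve times.
sequence = "MMFMMF" + "MF" * 12

n1 = sequence.count("M")
n2 = sequence.count("F")

# A new run starts whenever a letter differs from the one before it.
runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)

n = n1 + n2
mean_runs = 2 * n1 * n2 / n + 1
var_runs = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
z = (runs - mean_runs) / sqrt(var_runs)

Z_CRIT = 2.807  # two-tailed z at the 0.005 level, assumed from normal tables

print(f"n1 = {n1}, n2 = {n2}, runs = {runs}, z = {z:.3f}")
print("Reject randomness" if abs(z) > Z_CRIT else "Fail to reject randomness")
```

The observed number of runs is far above what a random ordering would be expected to produce, which reflects the nearly strict alternation in the sequence.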
7.
A study showed the following results for the different age groups. At the 0.05 level of significance, is there evidence of a difference among the age groups with respect to use of mobile phones for accessing social networking?

$\begin{array}{|c|c|c|c|} \hline \text{Use mobile phones to access social networking?} & \text{18-34} & \text{35-64} & \text{65+} \\ \hline \text{Yes} & 60 & 37 & 14 \\ \text{No} & 40 & 63 & 86 \\ \hline \end{array}$
[5]
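The comparison of age groups in Question 7 can be sketched as a chi-square test on the 2 × 3 table. This is an illustration only; the critical value 5.991 is the standard table value of chi-square at the 0.05 level with 2 degrees of freedom, hard-coded rather than computed.

```python
# Rows: Yes / No; columns: age groups 18-34, 35-64, 65+.
observed = [
    [60, 37, 14],
    [40, 63, 86],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell is (row total * column total) / grand total.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
CHI2_CRIT = 5.991  # chi-square(0.05; 2), assumed from standard tables

print(f"chi-square = {chi_square:.3f} with {df} degrees of freedom")
print("Reject H0" if chi_square > CHI2_CRIT else "Fail to reject H0")
```

The statistic is well above the tabulated value, so under this setup the proportions using mobile phones for social networking would be judged to differ across the age groups.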
8.
It is claimed that Samsung and Redmi mobiles are equally popular in Kathmandu. A random sample of 500 people from Kathmandu showed that 300 have Samsung mobiles. Test the claim at the 5% level of significance. [5]
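One way to approach Question 8 is a two-tailed z-test for a single proportion. The sketch assumes that "equally popular" translates to a hypothesized Samsung proportion of 0.5; the critical value 1.96 is the standard two-tailed normal value at the 5% level.

```python
from math import sqrt

n = 500          # sample size
x = 300          # respondents with Samsung mobiles
p0 = 0.5         # hypothesized proportion (assumption: equal popularity)

p_hat = x / n
standard_error = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

Z_CRIT = 1.96    # two-tailed z at the 5% level, from normal tables

print(f"sample proportion = {p_hat}, z = {z:.3f}")
print("Reject the claim" if abs(z) > Z_CRIT else "Fail to reject the claim")
```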
9.
In an effort to estimate the mean amount spent per customer for dinner at a major Atlanta restaurant, data were collected for a sample of 49 customers, and the sample mean was found to be 24.80. Assume the population standard deviation is 5. a. Compute the standard error of the mean. b. Find the 95% confidence interval estimate for the population mean. [5]
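The two parts of Question 9 reduce to short arithmetic, sketched below. Because the population standard deviation is given, the interval uses the normal value 1.96 for 95% confidence.

```python
from math import sqrt

n = 49            # sample size
x_bar = 24.80     # sample mean
sigma = 5         # population standard deviation (given)
Z_95 = 1.96       # normal value for 95% confidence, from normal tables

standard_error = sigma / sqrt(n)
margin = Z_95 * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"standard error = {standard_error:.4f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```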
10.
Define Markov chain and its characteristics. [5]
11.
What are the basic concepts of queuing theory? In a supermarket, the average arrival rate of customers is 10 per 30 minutes, following a Poisson process. The average time taken by the cashier to list and calculate a customer's purchases is 2.5 minutes, following an exponential distribution. What is the probability that the queue length exceeds 6? What is the expected time spent by a customer in the system? [5]
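The numerical part of Question 11 can be sketched with the standard single-server (M/M/1) formulas, which is an assumption since the question mentions one cashier without naming the model. "Queue length exceeds 6" is read here as more than 6 customers in the system, giving ρ^7; if only waiting customers are counted, the exponent would be 8 instead.

```python
arrival_rate = 10 / 30        # lambda: customers per minute
service_rate = 1 / 2.5        # mu: customers per minute

rho = arrival_rate / service_rate          # traffic intensity

# P(N > k) = rho ** (k + 1) for an M/M/1 queue in steady state.
p_more_than_6_in_system = rho ** 7
p_more_than_6_waiting = rho ** 8           # alternative reading of the question

# Expected time in the system (waiting plus service), in minutes.
time_in_system = 1 / (service_rate - arrival_rate)

print(f"rho = {rho:.4f}")
print(f"P(more than 6 in system) = {p_more_than_6_in_system:.4f}")
print(f"P(more than 6 waiting)   = {p_more_than_6_waiting:.4f}")
print(f"Expected time in system  = {time_in_system:.1f} minutes")
```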
12.
Write short notes on the following. i. Partial and multiple correlation coefficients. ii. Properties of a good estimator. [5]