Thursday, January 22, 2015

Rating US Colleges and Universities: An Inconvenient Reality

The US Department of Education/Duncan proposal (Postsecondary Institutions Rating System, or PIRS) to grade America's colleges and universities -- at the moment into three still-vague performance categories -- has yet to be issued in any detail.  The representations so far are that three factors are involved: affordability, access, and results.  Implicit is that the three factors will have to be measured using data already available Federally, byproducts of various Federal programs, including ones not directly involved in the various Federal education "Title" authorities.

If one had just landed on earth from a distant planet, with the technological prowess that implies, the notion that over 4,000 diverse higher education institutions could be successfully characterized and rated by those three factors might actually seem to make sense.  What could be simpler:  Do a nation’s applicable citizens have equal access to those institutions; can they afford the price of attendance; and what has been the value added by their participation?

After a few trips around the societal track, that visitor from another place becomes linguistically proficient and starts to understand organizational behavior and our societal hangups, concluding that the proposed scheme for characterizing an educational institution has, by analogy, the credibility of studying earth's life and its behavior by simply designating it bacteria, archaea, or eukaryota.  (We as multicellular organisms are constructed of eukaryotic cells, microbes, et al., but that true depiction falls a bit short of characterizing the sentient human.)

In fact, the scheme proposed by the US Department of Education is a total whack job, calling into question Secretary Duncan's intellectual competence, or raising the question of what values and ideological excursion precipitated the proposal.

Both the rating scheme, and in fairness this writer’s challenge, fit the trope  “says easy, does hard.”  Let the reader be the judge, based on the reality of the behaviors, factors, and rating process being proposed for PIRS.

What are the issues?

The US Department of Education/Duncan depiction of the need for this scheme remains vague.  What are the reasons the proposal has been floated now, and how do they hold up under scrutiny?

  • Are the proposed ratings – even if valid and reliable – needed?
  • What is the valid unit of analysis, i.e., the total institution, or the intra-institutional colleges and schools (there may be great variance inside an institution)?
  • On any factor requiring differentiation to constitute a rating basis, is there greater intra-organizational variation than variation among institutions?
  • Will the ratings differentiate institutions judged deficient in providing equitable access?
  • Will the ratings differentiate institutions based on cost of delivery of a degree; will those costs be comparable based on the quality of the degree delivered?
  • Subsumed in the above, how are the times for delivery of a degree accurately determined?
  • How is it determined that ratings of institutions are based on valid assessments of comparable institutions?
  • How will the punitive measure proposed change institutional behaviors?
  • How does a limited number of ad hoc measures of existing variables translate into a rational scheme to measure performance of any institution that is, de facto, a system and complex layered organization?
  • Are the variables proposed up to the task of the alleged measurement: genuine accessibility, true net cost, education value added, and valid comparisons?

In the wake of the undercutting of genuine learning experiences by the dogmatic Federal pursuit of standardized testing as the backbone of US public school reform, it seems fair to propose that future initiatives be judged by one of the same standards as medical practice – first, do no harm.

Ratings needed?

There are currently more than 20 US web sites devoted to matching a collegiate prospect with a college or university, and multiple ratings are already published, e.g., US News, Forbes, Princeton, et al.  Add to those the online sites of virtually every credible college and university.

The categories of information may not be comparable among these sources, and they have variable credibility; however, the proposed assignment of several thousand institutions into three crudely defined hoppers, even if those assignments were valid, appears destined to add nothing to a prospect's effective discrimination in choosing a collegiate destination.

The unit of analysis?

A fancy phrase for a core issue: what measure of homogeneity, or what level of disaggregation, makes the institutions being assessed comparable?

This is an example previously used, but it makes the point:

“Indiana University (IU) has two main campuses, Bloomington and Indianapolis, different academic environments.  It has six regional campuses. The Bloomington campus has 14 separate schools plus a College of Arts and Science.  All 15 major units have multiple departments, multiple faculties, heterogeneous curricula (and some institutions differential tuition) — that factually determine the quality of a degree — with 180 majors, in 157 departments, representing 330 degree programs.  The other campuses have variable presence of the same venues, plus where a campus is a joint IU-Purdue campus, there may be additional departments representing engineering, nursing, et al.”

What is the appropriate unit for measurement:  The composite institution; each campus location; the college(s) embedded in each campus; the various schools; even subject matter departments that may be as large in student enrollment as some small colleges?  Those differences in programs and enrollees may produce very different results for the variables proposed as the basis for ratings.

Foreknowledge of the universe?

Is there any a priori basis for the Department/Duncan proposals based on even sample research of how ratings factors show dispersion across institutions, or within institutions and across the above potential units of analysis?  Thus far the Department has offered no evidence of prior or ongoing research that would foot any rational proposal of this magnitude and potential for negative effects.

The second factor impacting validity is comparability.  Are any two institutions of higher education comparable, given their capacity for independence of action and complexity of offerings?  What research on multidimensional properties has been executed to provide categories of institutions that can arguably be compared?  The factors allegedly being rated are intrinsically linked to many of those properties, and therefore have the potential to be misinterpreted as performance gradients rather than simply concomitant effects of those properties.

A college/university is a complex organization.

In the rush to rate higher education institutions, a fatal error is the failure to recognize that every college and university, even the most austere, is an order of magnitude more complex as an organization than, for example, a public school, which has narrower roots, fewer human resources, and a relatively simple organizational structure; and even with those similarities, our public schools are not automatically comparable in assessing learning performance or even test-based metrics.

Breathtaking is the naïveté required to believe that any organization, let alone one as complex as a college or university, could be assessed for quality based on a handful of incomplete or flawed variables (if naïveté is the true motivation; venality if it is not).

The scope of measurement of organizational performance – especially for an entity as layered and complex as a college or university – is far beyond the scope of this blog.  Many assessment models exist, and the real factors, variables, functions, actors, and internal behaviors that foot an organization's true performance are massive.  Just one example of such a guide to determinants of performance is linked here.  The Department/Duncan model is roughly the equivalent of trying to build a real operating system with Legos.

Assessing student access to higher education?

As complex as every other factor footing the proposed rating scheme, this one is presently categorically blocked, both by a lack of longitudinal research on how admittance is sought and plays out in real time, and by confidentiality law installed by Congress.  Answering this question would require comprehensive access to college applicant records leading to acceptance or rejection, access not permitted by law except, at the moment, to the applicant.

That latter access was just exploited by a cluster of Stanford University undergraduates, who demanded and received their full files.  The results underscore the complexity and nuances of the admissions process; such full disclosure would be needed to assign fault for failures to admit, and to attribute that failure to some form of discrimination other than student performance criteria.

Time to acquire a diploma as a performance factor?

On its face this factor appears to be one that, coupled with the cost of the educational experience, might be defensible.

In 2011 a group within the US Department of Education was tasked with assessing the factors that might be measured for rating colleges/universities, initially targeting two-year institutions.  Of the multiple factors noted above, only one was thoroughly vetted – the time required to acquire a degree/diploma.

At the moment the only data the Department has to quantify that factor is the number of years taken to acquire a degree or diploma by a first-time, full-time, degree-seeking student.  As focus shifted to four-year as well as two-year programs, it is from that narrow data concept that the various alerts have come, stating that some material percentage of BS/BA-level students fails to get a degree within the nominal four years, and now within six years.

The Department's own report, which cited the errors in that measure because it did not track transfers, possible degree completion, or subsequent degree pursuit and acquisition after the initial dropout, has seemingly been ignored in the PIRS ratings quest.  In short, that six-year figure for a four-year degree, popularized by our press, is likely a misrepresentation of reality, and little or no research has been undertaken to rectify it in pursuit of the ratings.

Still another idea floated is the use of Federal job placement data for new graduates as a surrogate for the quality of education delivered.  Your average eighth grader could slam that rendering of uncritical thought; at the most basic level, starting salaries of new graduates are tightly linked to job and professional service type, and our institutions are diverse in the occupational preparation supplied, therefore salaries are confounded with job type.  As the occupational types number in the hundreds, type would have to be held constant to impute a salary-based quality indicator.  The universe of colleges and universities categorically can't support the data logically needed.  A small sketch of the confounding follows.
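To make that confounding concrete, here is a minimal sketch in Python with invented numbers (the schools, graduate counts, and salaries are hypothetical, not Department data): two institutions whose graduates earn identical salaries within every occupation still show a large gap in raw average salary, purely because their occupational mixes differ.

```python
# Minimal sketch, with invented numbers: two hypothetical institutions whose
# graduates earn identical salaries within each occupation, yet whose raw
# salary averages differ sharply because of occupational mix alone.

# occupation -> (graduate count, mean starting salary), per institution
school_a = {"engineering": (800, 65_000), "teaching": (200, 38_000)}
school_b = {"engineering": (200, 65_000), "teaching": (800, 38_000)}

def raw_mean(school):
    """Unadjusted mean starting salary across all of a school's graduates."""
    grads = sum(n for n, _ in school.values())
    return sum(n * pay for n, pay in school.values()) / grads

print(f"School A raw mean: {raw_mean(school_a):,.0f}")   # ~59,600
print(f"School B raw mean: {raw_mean(school_b):,.0f}")   # ~43,400

# Held constant by occupation, the two schools are indistinguishable.
for occ in school_a:
    print(occ, school_a[occ][1] - school_b[occ][1])      # 0 in every stratum
```

Any salary-based indicator would therefore have to be computed within occupation strata, multiplying the data demands well beyond what the Department's extant files can support.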

Punish to change?

The first question is: to change what?  The time to degree, the net cost, the quality of learning generated?  The first item is unresolved, the second is subject to measurement of a total cost to the student as yet undefined, and the third will allegedly not be attempted.  One hammer proposed is tying availability of Pell Grants to a college's or university's rating.  Other public critiques of PIRS suggest that, because of the crude reasoning and categories footing the scheme, redirecting Pell Grants may actually worsen support for the collegiate candidates most needing it.

Next, will the crude ratings being proposed by the Department/Duncan affect the behaviors and performance of the institutions targeted?  Because of the complexity of decision making in present higher education, with its layers of stakeholders, that is highly questionable, even if the ratings induce greater deliberation.  Using the prior IU example for a moment, student financial aid accounts for roughly seven percent of the composite cash flow associated with annual operations, and that does not include the influence of endowment funds flowing to the institution.  Presently, the departure or hire of a handful of sports coaches might in some quarters have greater impact than everything the US Department of Education can use to put a brand on an institution.

The list goes on, to where?

Pre-dating NCLB, and blossoming in the period immediately prior to the Obama Administration’s installation, there was a small explosion of studies and conferences addressing the core issues surrounding change in America’s colleges and universities.  Some of the most comprehensive work, now simply being repeated in most discourse on higher education change, was originated by the Association of American Colleges and Universities, and by a small number of states, the latter focusing on measurement of the quality of community college outputs.  This work was seemingly lost in what subsequently became, it is asserted, an unthinking and unreasonable commitment by the Department/Duncan to ideologically driven postsecondary reform tactics.

This generic topic is only scratched by the above observations.  There is cause to argue that America's colleges and universities should be assessed for mission, and for operating performances that miss or contradict that mission.  Staying with the former academic stomping grounds for an example, and with a prior small window into IU's 2014 strategic planning for its Bloomington campus, the resultant plan was narrow in perspective, institutionally self-centric, and virtually devoid of any recognition of the national and strategic issues that vex present higher education.  Procedurally the planning process was less than inclusive, literally taking properly credentialed faculty representation out of the loop and substituting a set-piece of submissive faculty for broader campus faculty input.  Change is arguably needed in present US higher education organizational leadership, as well as in the mechanisms of pursuit of student learning.

But overall, the present US Department of Education/Duncan initiative is arguably the flimsiest and most disingenuous proposal thus far for the purpose of producing positive change in our collegiate institutions.

Lastly, there is also obvious room to argue that none of the narrow and simplistic reform designs currently being floated for higher education, irrespective of origin, should be permitted to advance without meaningful research that first codifies key characteristics and performance indicators for all 4,000-plus institutions, or minimally for a projectable sample of them.  That likely is not possible without creativity currently evading higher education, and a new level of inter-institutional conversation and cooperation among university leaderships, along with comparable cooperation among the states, perhaps via the National Governors Association (NGA).  The assumption is that the present US Congress is unlikely to grant such power of discovery to the present White House.

Conclusions?

Viewed against the common sense of most of Tuesday's SOTU address by Mr. Obama, this proposal simply doesn't pass a "sniff test."  The complexity of the mission, juxtaposed against the ignorance and ad hoc tactics proposed to rate higher education, has to be viewed as failed logic and programming.  Like the pragmatically failing, testing-only alleged reform being pressed on public schools, this proposal is not the product of the competence that should guide national education advocacy.

American public higher education, formerly dominated by state funding and occasionally adequate oversight, has executed a 180 over the last several decades.  For example, using IU again for convenience, that university system's funding from the State of Indiana is now less than 24 percent of total annual revenue.  There is an inevitable loss of practical public control and oversight of institutions that must retool to support themselves.

Our collegiate managements reflect intelligent and highly educated human resources, but are as vulnerable as any private sector firm to managerial failure; perhaps more so in the many institutions where leadership has come up through the academic ranks and lacks the managerial expertise demanded in the private sector.  That has become increasingly evident in higher education leadership's emulation of the corporate leadership that formerly dismissed strategic thinking.  In short, our collegiate leaderships can learn something from our private sectors and from the resources who have pioneered change in management thought; the question is whether leaderships will register that in time.

America's colleges and universities are also vulnerable to obsolescence in spite of the intellectual capital they inventory.  Change is needed, as suggested in a prior post, to:

  • Prioritize the real missions;
  • Get on the same page in providing information for potential students;
  • Make the process of accepting students as transparent as possible within the context of existing confidentiality laws;
  • Address the phenomenon of substituting part-time faculty for tenured and tenure-track teachers, or verify that the former's vetting equals traditional scrutiny;
  • Combine cost-effectiveness initiatives with learning output assessment to increase productivity;
  • Get back to four years (or two years) meaning "four years";
  • Consider the possibility that "lean" techniques applied to industry do have a role in education; and
  • Move beyond present institutionalization of curricula to aggressive updating of the knowledge being offered.

Lastly, it is impossible to avoid the reality (provocative to the guilty) that a whole lot of America's higher education shortfalls do not spring from higher education, at least tactically, but from the fact that US public schools, and especially the secondary grades, are simply not performing.  Over a dozen years, NCLB, in spite of the hype, has left a quarter to a third of America's children "left behind," and they will struggle to get beyond that fate.

There is really no mystery why America is still in a form of educational crisis – you only have to pull cognitive function out of where it has been slumbering.  Look critically at too many of our local schools still dug in to last century's rituals and knowledge obsolescence, refusing change, exhibiting administrative venality, and governed by BOEs that are unprepared or misdirected.  That is amplified by inadequate teacher training by our schools of education, offset only by the better fraction of US teachers who have internalized stronger academic values and taken the initiative to advance their own learning and classroom skills.

Perhaps there is discovery afoot, precipitated by the shift in emphasis to higher education: that a century of disassociating US public PreK-12 systems and practices from the postsecondary education function has to come to an end, or will at least begin to register in educational and legislative awareness.

Monday, January 12, 2015

US Education Reform: Stumbling "Through the Looking Glass"

Lewis Carroll's "Through the Looking Glass" seems an appropriate metaphor for the distorted cognition and magical thinking characterizing current alleged reform of US public schools, and now prospectively of its colleges and universities.  The premise is that the illogic of present education reform fits.

Upside Down and Catawampus

The US has now endured over a decade of public K-12 education infighting, but on a battlefield resembling current real ideological warfare: multiple adversaries, with some trouble defining the good guys versus the bad guys.  Combatants:  Our entrenched public systems; NCLB; NGA; the latter's spawn, CCSSI; ALEC; testing companies; state education bureaucracies and legislatures; charter entrepreneurs; anti-testing coalitions; anti-CCSSI coalitions; sundry education opportunists; even direct parental action to block the testing tsunami.  The dispersed power blocs on all sides of the skirmishes promise no easy or quick resolution.

On the table but still lacking execution is the Obama/Duncan proposal to grade US colleges and universities.  That proposal's dubious distinction: trying to scale the performance of 4,140 higher education institutions with a handful of available variables that already possess metrics.

Now comes the latest evolution of NCLB, Mr. Obama's "line in the sand" doubling down on standardized testing.  Mr. Obama's lines in the sand, however, have proven to be less than durable.

Last out of the chute is the proposal for free tuition for two years of community college, reflecting little transparent awareness of the implications of further loading up enrollments at community colleges, institutions with largely unknown intellectual provenance and capacities for quality learning.

The take from all of the above initiatives is that there is a root agenda that has been put in place by the Obama Administration – distinct from the origins and original highway of corporate reform, but borrowing its standardized testing/punishment hammer – and one of its targets, encapsulating utopian educational equity, is 'some college for all.'  This ideological tenet hasn't been sufficiently challenged.

Three overarching shadows sully this grand vision:  One, there is no present strategic support for the notion that all of America needs or wants a collegiate diploma; two, the proposal crudely ignores the reality that failing public K-12 has created and exacerbated the need, and piling another challenged system on prior failure isn't a fix; and three, the entire reform movement totaled the reform bus before it was out of the terminal.

Specifically, every reform scheme floated has adopted some quick-and-dirty end-game assessment to drive change, but by ignorance or haste has ignored the essential linkages between where performance is flagged and the underlying organization and processes that actually cause and change that performance.

Four logical conundrums weaken the foundations of present education reform models:  Deconstructed knowledge does not equal critical thought and sustainable learning; academic organization is not a monolithic ‘it;’ the economics of learning quality assessment and assurance are real and critical; and egregiously, where have all the sages gone along with “the cooperative principle?”

Deconstruction Naïveté

Deconstruction, and its Siamese twin analysis, have always been the lally columns of K-12 education.  Break any knowledge into its constituent parts, memorize them, and voila, learning?  Oversimplified, but the core model still dominates public education's conceptual thought processes.  Over time the parts have been connected and extended into constructs and relationships, but the result still fails any test of more advanced understanding of the science of explanation and prediction.

The reasons go back over a century, and form the roots of the divergence, to the present day, between higher education and our public schools.  The early intellectualism that sculpted public schools, whether from a learning path, or more likely from the ego driving public K-12 pioneers to want their own identity, created a system of education for education's sake that never aligned with the science of inquiry and explanation driving collegiate education.  The process of conveying bits of knowledge, and especially the supporting classroom protocols, became public K-12's dominant theme.  The application of knowledge components to larger constructs and models, explaining the behavior of phenomena, was either lost in teaching preparation or simply never understood by the public K-12 teaching factory.

Offering the benefit of the doubt, it seems incongruous that the high-level leadership currently flogging test-based school reform can be unaware of the learning dysfunction and deficits imposed by those venues and tactics.  The obvious questions:  What leadership values are driving "corporate reform;" what ideologies can justify the negative strategic learning effects of present reform tactics; and is there in that thinking any calculus for the downstream effects of the approaches?

Lastly, literally screaming at one is the hypocrisy of Obama/Duncan: specifically, employing the trope "college readiness" from virtually PreK on, while arguably aware that collegiate academics engage a different cognitive set and different mechanisms than transient early learning based on memorization and ritual.

Testing Versus the Mechanisms of Performance

In an article in the January 10 Washington Post, unfolding Mr. Obama's proposal for two free years of community college, the reporting also covered this Administration's "line-in-the-sand" commitment to standardized testing.  An admittedly overused cliché, but that reaffirmation appears to be the humorous definition of insanity – "continuing to do the same thing but expecting a different result."

The same article featured a quote from Charles Barone, "…policy director of Democrats for Education Reform and who helped write No Child Left Behind as a congressional aide":  "I don't know how else you gauge how students are progressing in reading and in math without some sort of test, some kind of evaluation.  If you want to see a kid's vocabulary, how they write, if they can perform different math functions, the only way is to sit them down and give them a test."

Intellectual and sane policy?  We don’t know what that learning is supposed to be except as defined by magical third-party testing.  We reject the view that our teachers can ensure learning and assess classroom formative or summative performance without the 'psychometrician in a bubble.'  But externally testing until hell freezes over will surely provide that enlightenment?

Let's try a hypothetical.  You manage a division of a technology firm.  The word comes down: the corporation needs a state-of-the-art xflipvoxcomp (a computing device qua voice recognition qua AI) to fill a market segment gap in the corporation's consumer technology offerings.  An obvious next step: you query topside, what are the product performance and design goals, target market positioning, and pricing-cost-incremental investment criteria for the development?  The answer comes back:  We don't have a clue, but we'll be testing your result the minute it is prototyped to see if you keep your job.  Duh?

Whether prompted by ignorance, or venality, or simply ideologically driven thinking, this second factor rivals the first in undermining the alleged logic of present public K-12 reform, now proposed by Obama/Duncan to be extended to our 4,140 colleges and universities by a simplistic rating scheme.  No acknowledgement of the factors or processes that ultimately determine whether a desired learning effect is achieved; no acknowledgement of the organizational complexity of collegiate structure; no acknowledgement of the delta separating teaching assets and process in collegiate settings from the assets and administration in public K-12; and no acknowledgment that colleges and universities are systems featuring semi-autonomous layers of sub-systems because of the role of faculty governance.

By what logic of systems thinking is it assumed that beating on the aggregate of a collegiate institution with ratings will produce positive change in learning process and performance?  There is some evidence that pseudo social science, like the US News and Forbes collegiate rating schemes, has produced dysfunctional tweaking of academic recruiting and reporting, obscuring rather than clarifying information for those seeking higher education options.

Higher education's look-alike for public K-12 testing cheating isn't a great reach; for example, a direct and quick, but dysfunctional, way to meet the time-to-diploma criterion being flagged is to surreptitiously reduce the requirements for achieving the diploma.  Not exactly a useful strategic quality goal for America's higher education trajectory?

Achieving Quality Learning

Virtually from the first, early 1980s rhetoric about change in public K-12 education, the arguments were characterized by aggression and retribution for perceived wrongs.  In public K-12 those offenses seemed to revolve around the perception that our public schools had become ideologically socialistic, more concerned with student self-esteem and vague learning objectives than preparation for succeeding in our market-based systems.  Hence, the earliest reform language prominently stressed “accountability,” the presumption apparently that there was none. 

The basic premise of both public K-12 change and now, prospectively, higher education change seems to be that it must be punitive to create motivation.  Is the implicit assumption for collegiate reform that the genre is elitist, and needs to be punished?  The corollary in present reform is that the good guys and the bad guys must be sorted by a process analogous to manufacturing quality control: inspect, measure, correct flaws, scrap out the offenders.  That logic worked in the early decades of the 20th century for American industry, so it should work for education?

One small glitch:  In the private sector, that inspection-based route to quality of product or service output was displaced post-WWII by a cluster of routines, starting with the work of Juran and Deming, among others, on statistical quality assurance techniques, which dramatically reduced the cost of achieving quality.  That was followed by the Japanese revolution in TQA, or total quality assurance, which changed the auto industry and subsequently most other US industries.  The concepts of process control emerged to place assessment far earlier, and continuously, in the evolution of output, even eliminating traditional late-stage inspection logic, further reducing costs while ensuring quality.  Lastly, the contemporary concept of how organizational performance is motivated and achieved is not your granddad's.

These are not soft arguments, but hard economic realities.  By delaying quality assessment until the product pops off its assembly line, the cost of a quality deficit soars.  The earlier in the process an error is detected, and the more traceable the assignment of cause, the fewer resources are scrapped or wasted, the lower the cost of output, and the lower the opportunity cost of the total assets deployed.  A back-of-the-envelope illustration follows.
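The arithmetic can be shown with a toy example (all stage costs are invented purely for the sketch): a defect caught at the end of the line carries the cost of every stage already spent on the defective unit.

```python
# Illustrative-only cost-of-quality arithmetic (all stage costs invented):
# a defect caught late carries the cost of every stage already spent on the
# defective unit, so late detection multiplies the waste.

stages = [
    ("design", 10.0),
    ("fabrication", 40.0),
    ("assembly", 30.0),
    ("final inspection", 5.0),
]

def cost_sunk_at_detection(detected_at):
    """Cost already sunk into a defective unit when the flaw is detected."""
    spent = 0.0
    for name, cost in stages:
        spent += cost
        if name == detected_at:
            return spent
    raise ValueError(f"unknown stage: {detected_at!r}")

print(cost_sunk_at_detection("design"))             # 10.0  caught early
print(cost_sunk_at_detection("final inspection"))   # 85.0  caught late
```

The same arithmetic, applied to a student whose learning deficit is discovered only at a terminal, end-of-process test, is what the next paragraph translates into educational terms.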

Applying this to education systems is not rocket science:  Among the many genres of processes creating utility, education has the most to lose, both for its recipients and for its agents.  The costs (economic, social, and opportunity) of discovering flawed learning only after the process has reached a terminal point are major.  The effects for education recipients may not even be recoverable.

Flat out, the present mechanism of trying to change our educational effects and productivity by testing or grabbing metrics after the processes for learning are already expended is somewhere between senseless and insane.  Present extravagant testing and post-instruction measurement leave systems clueless about the sources of needed internal change, and defensive.  The fix is to employ systems thinking in how student learning is achieved, ultimately knowing how the factors of learning's processes interactively work, focusing quality planning and assessment in the earliest stages and then extending it continuously through the education process.

Reformers’ reliance on a nearly century old, and arguably obsolete conception of quality assurance is almost inconceivable, but has been the key motif of alleged reform.  A bold-faced ‘why’ is certainly a component of another needed test of accountability – this one for those prosecuting present reform?

Trashing “the Cooperative Principle”

As contradictory of American intellectual achievement as current “corporate reform” and proposed higher education attacks are, and as dismissive of professionalism, the fourth issue with present education duress may be the most egregious.  It is driven by evolving disregard for “the cooperative principle,” defined as “specific rules for conversations,” or the social interactions that in civil societies become the basis for successful negotiation and problem solving.   

Overstatement?  Develop metrics that will measure the volume of constructive, cross-aisle communication in our 2015-2016 US Congress?

The US has now experienced the first 30 years of challenge to public education; how many more decades of opportunity costs should this nation incur before critical thinking about critical thinking finally emerges?  The reformers have become legions with differing interpretations of reform, with different values and tactics, and none show the capacity either to listen to the targeted systems or to communicate among factions in any arc.  There are two perspectives footing this segment of the critique:  How did embryonic education reform become so contentious, and what is driving this societal conflict?

The first question has a discrete answer in the case of public K-12.  It begins with former President Reagan's refusal in the 1980s to name the National Commission on Excellence in Education, followed by US Secretary of Education T. H. Bell's creating that body on his own authority and naming its members.  The Commission chair was David Pierpont Gardner, an accomplished higher education administrator associated with the University of California and then President of the University of Utah.  His biography is impressive but gives no hint that he was well versed in public K-12 issues.  The product of that Commission was "A Nation at Risk" (ANAR), the report that politically launched "corporate reform," subsequently precipitating "No Child Left Behind" (NCLB).

Simultaneously a team led by Dr. John Goodlad, equally applauded but for public K-12 education leadership and pursuit of change in public K-12, was completing the only large field study of US public schools, covering 27,000 children and a carefully stratified sample of systems.  As ANAR was being drafted, the Goodlad team’s results – suggesting a vastly different and strategic approach to changing public K-12 education – were requested and presented to that Commission.  Those results, from Dr. Goodlad’s subsequent narratives, were ignored because the Commission wanted ANAR to issue a “thunderclap” that would startle and panic Americans, justifying an aggressive public school reform agenda. 

There is no way to reconstruct what might have been, but the contents of John Goodlad’s work suggest America might be an epoch ahead had myopic and politicized results and policies not prevailed.

Part two of this factor seems to mirror our political milieu:  Extreme partisanship; unwillingness to compromise; dogmatic refusal of transparency; unwillingness to communicate across education fiefdoms; perhaps an evolution of values, and even of the meaning of language, that turns exchanges meant for problem solving into warfare; and, increasingly, the dissolution of former virtues, making self-interest and power trips the stuff of many public school administrators, college administrators, BOEs, and higher education boards of trustees.

Particularly damaging to American public education is that the above seem to have become endemic in our society.  Call it organizational isolationism, or circling-the-wagons, but education enclaves from local schools and especially their BOEs, through college and university administrations, currently demonstrate an incapacity for cross-group communication and problem solving.

Our media have documented that the US Department of Education, and especially its current leadership, have been neither good listeners to systems' feedback, nor receptive to education experts' critique of policy.  Have our state education bureaucracies been any better?  The long view of public education reform at this beginning of a new year is that none of the critical factors affecting either PreK-12 or higher education quality and performance have dramatically, or even more than marginally, improved.

Backing Out of the Looking Glass

The above arguments dispute some of today's education Pollyannaism, which sees our systems now moving toward learning, enlightenment, and goodness.  One has to ponder whether Obama/Duncan, and the back rooms that have powered present accountability attempts, may with utopian visions have accepted and nurtured a test-based reform activation model that is flat-out dysfunctional, precipitating unintended consequences.  As long as public school success continues to be tautologically defined by the same standardized testing that constitutes its measurement – supplied by the same developers and vendors of testing reflecting vested interest – the claim is false.

There is an obvious mechanism for objectively and empirically testing present testing initiatives.  It involves creating a consortium of America’s highest rated foundations/think tanks, with demonstrated objectivity on the mechanisms for public K-12 assessment.  

The mission would be sponsoring a three-phase, higher education-staffed research effort:  First, assemble more robust models of needed learning, by grade band and by knowledge type, free of political ideology; second, do the meta-research needed to create testing representative of each of those learning models (much already exists but has been ignored with prejudice); and third, execute sample-based field assessments of the various test logics, with the same rigor and controls already illustrated by accepted NAEP testing.  Standardized test versions are part of the assessment; the question is, what parts of more valid learning assessment can they replicate?

One hypothesis is that some to much of present standardized test content is relevant, but selectively: by grade band, by knowledge type, and by the epistemology that fits the knowledge.  A second hypothesis is that such a research effort would surface a more valid and comprehensive understanding of what constitutes learning, and of which configurations are most material for our evolving economy and society.  Almost by definition, the last couple of decades of neural research, whose implementation is still scarce in both K-12 and even higher education pedagogy, would up the game.

The battle between what education should produce -- recognition, literacy, explanation, measurement, capacity for prediction, capacity for creativity, intellectual values -- and what has been occurring in our systems and society has been captured by analogy in many of the (economic) assessments of Nobel Prize-winning economist Paul Krugman.  There was no resisting paraphrasing one of Dr. Krugman's trenchant New York Times editorial offerings, spinning it to reflect our educational malaise.  With apologies:

The main point is that we’re looking at political and educational subcultures in which ideological tenets are simply not to be questioned, no matter what.  The vendor-driven and psychometrically defined testing is valid no matter what actually happens to the student’s capacity to critically think and create, classroom teaching without the ritual mechanics of school of education mantras must be a failure even if it’s working, and anyone who points out the troubling facts is ipso facto an enemy.

Epilog

Next post will tackle the earlier higher education question:  If you wanted to rigorously, and with any hope of measurement success, create a scaling model for our colleges and universities, what factors would you target, what units of analysis would you employ, what variables would you seek to make into metrics, and how would you stratify/cluster institutions to allow valid comparison?  How would you attempt to combine what is measurable into some composite normative model of institutional quality?  How would you accommodate the internal variability in institutional quality?  Lastly, how would the modeling and metrics produced be structured and communicated to our potential college matriculants to become more meaningful information for choice?


RPW 



Sunday, December 28, 2014

Assessing US Higher Education: Information, Intimidation, Ignorance, or Insanity?

The last post of Edunationredux offered a partial critique of the Obama/Duncan scheme to rate America's colleges and universities. Prior national critique reflected almost a "you gotta be kidding" ambience, illuminating the perceived chasm between what Arne Duncan and the US Department of Education are proposing, and anything resembling intelligent social science applied to the measurement task.  Today’s post extends the prior critique, exploring the real measurement chores needed to create valid and reliable ratings of America's colleges and universities.

That chasm between the proposal and reality is so great it raises major questions:  What conceptual malaise and what leadership degradation have occurred in that Department, who is steering this measurement debacle, and what resources are executing the work?  Is the proposal chain-rattling just to get the attention of higher education leadership?  If the intent is to actually carry through the scheme, is this another Federal agency that has now lost steerage, and mismatched the resources needed to conduct competent education work?

Post Critique, Critique

One tiny slip in the pronouncement of a functionary in the Department of Education may have given away the naïveté and slanted thinking footing the current proposal:  One of the factors allegedly being considered was how to treat "improvement" as a variable, and presumably as a simple metric.  The statement implies that the designers of this scheme may see the assessment of our colleges and universities as occupying the same conceptual space as improving test scores in a public school system.  There are likely a few community college-scale institutions, close to being simple extensions of high school-level performance, where this may be applicable, but anyone knowledgeable about the functions within a major university would deservedly see this as bizarre.

A last retrospective issue is further scrutiny of the misguided proposal to use beginning salaries of graduating students as a basis for institutional assessment.  This component of the proposal has some serious logic issues.  Aside from the nearly impossible chore of equilibrating the professional destinations of students across institutions to create one valid metric (or even multiple metrics), and the cognitive error of relating quality to the profession sought, a peek at the distributions of those starting salaries poses an even more daunting issue.  Starting salaries are not distributed normally, but are skewed to the high end.  The overwhelming body of starting salaries is so constrained, the distribution so leptokurtic, that little if any discrimination among most salaries attributable to institutions could be detected.  A small simulation below illustrates the point.
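A quick simulation, with entirely hypothetical parameters, illustrates the statistical objection: when starting salaries are tightly bunched with a long right tail, a plausible between-institution difference in mean salary is dwarfed by the spread within any single institution, so the metric discriminates poorly among schools.

```python
# Hypothetical simulation of right-skewed, tightly bunched starting salaries.
# All parameters are invented; the point is that a plausible difference
# between institutions is swamped by the spread within either one.

import random
random.seed(0)

def simulate_salaries(base, n=2000):
    """Lognormal-style draws: most salaries near `base`, a long right tail."""
    return [base * random.lognormvariate(0.0, 0.25) for _ in range(n)]

school_a = simulate_salaries(45_000)   # assumed "stronger" institution
school_b = simulate_salaries(43_000)   # assumed "weaker" institution

def mean(xs):
    return sum(xs) / len(xs)

def stdev(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

gap = mean(school_a) - mean(school_b)
print(f"between-school gap in means:   {gap:8.0f}")
print(f"within-school std. deviation:  {stdev(school_a):8.0f}")
# The within-school spread is several times the between-school gap.
```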

A pretty cynical outcome of using the proposed metric(s) for salaries, aside from all other faults, is that success in that venue would come from maximizing an institution's output of petroleum engineers, and wiping out the education of all PreK-12 teachers.  If the underlying intent of this scheme is some social engineering to equalize higher education opportunity, and social and economic states, its extreme liberal designers need to go back to the drawing board, or better, acquire some higher education.

Fair Challenge

The classic, and legitimate, challenge to last post's critique of what's proposed -- that it is a loser -- is to provide a more effective system for assessing our institutions.  The remainder of this post takes a stab at that challenge.

Dimensions

The starting point in this quest is identical to every legitimate research effort since the Enlightenment:

  • What is the goal, what hypotheses are to be tested, what question or questions are being posed for answers?
  • What is the universe from which measurements are sought?
  • What are the variables or factors requiring measurement, and what are their functional relationships to the criterion question(s)?
  • What are the properties of the variables, in this instance the measurements wanted, i.e., nominal, ordinal, interval, cardinal?
  • What are the hypothesized or measurable distributions of the measurements sought?
  • How do the error terms intrinsic to all variables fall out (intra-institutional variance versus inter-institutional variance), driving the comparisons of institutions or institutional subsets sought?
  • What are the weights of the contributing variables in forming, then informing about, the differential effectiveness or qualities of the institutions being assessed?
  • And critically, with a finite set of candidates for positioning, how may the units in the universe need to be stratified or clustered to minimize confounding of results attributable to basically different higher education systems being appraised?

Given a US universe of 4,140 institutions of higher education, with internal partitioning that may multiply the actual units of analysis by orders of magnitude, and with hypothetically complex variable sets driving the criterion effect, the project is not the simplistic vision of the US Department of Education, revolving around already extant data, but what is now colloquially termed "big data": "...an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications.  The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations."  The mission here, assigning performance ratings to America's colleges and universities, is arguably the very definition of the analysis challenge described.

Department of Education thinking is apparently to measure some amalgam of institutional functional performance and contribution to social goals.  Both become subdivided into constituent goals that complicate what is proposed and currently measured:

  • For performance, institutional graduation rates overall versus by students' degree tracks, as well as longitudinally by how the process is finally achieved and the time involved;
  • The learning effectiveness of what's been acquired along the way (made more complex when apportioned among multiple disciplines and degree tracks);
  • The complexity of devising the true costs of education delivered, plus the cogent issue of the productivity of all of the assets and operations incurred to produce a graduate; and
  • Close to the most salient first use of any assessment, whether the results actually, via improvement, materially impact the choice processes of prospects seeking higher education.

Also ignored in the Department's rhetoric is the longitudinal complexity of the worth of prior learning at exit from the institution, versus its worth at the various career stages the graduate experiences.

Measurement Factor Complications

The performance of our institutions in creating equitable student access may be slightly easier to assess in principle, but it introduces major problems in execution:  A large multivariate causal set of determinants of the schools screened, preceding the issue of differential institutional compliance with equitable admissions, is problematic; the reality that acceptance of those who might be discriminated against also rests on the failures or successes of our public K-12 systems, long before an institution's action affecting equity kicks in; and a major barrier to measurement at the level of the individual student/family is driven by confidentiality considerations.

A pre-collegiate case in point, a familial relation of this writer, is a collegiate freshman at a major university, majoring in an engineering specialty.  Partially because of the 9-12 work in an effective science high school, this soon-to-be second-semester freshman will be moving into second-semester sophomore-level academic work with perfect "A" grades, primed by the prior high school work.  Adding to the analysis challenge of assessing institutional performance, then, are the assets or deficits that precede and impact acceptance.  The remedial work impeding, or the prior learning permitting, accelerated collegiate work becomes another complication in assessing collegiate end-game contribution.

Another set of factors in judging performance is the subjectivity of collegiate grading protocols, variable among institutions, among schools, among departments, and even among individual faculty.  Without some national, standardized achievement testing, by specific discipline or academic track of students, the comparative use of even grades and point averages as measures of institutional performance adds complexity to any rating scheme.

The prior Edunationredux post also unfolded another major constraint, comparison of institutions based on the proper unit of analysis as well as assurance of comparability, rendering the simplistic measurement chore implied in the Obama/Duncan thinking the height of amateurism.

Still another factor ignored in the current conceptualization is the role played by geographic and location factors, perhaps even highly specific location factors related to the population and cultural composition surrounding a student's residential assignment, influencing institutional outcomes.

But there is another gut issue that will at present -- and in the absence of never-executed benchmark research on our colleges/universities -- blindside and hamstring the proposal.  That is the core pattern of variance of any variable or factor used as a basis of measurement.  In virtually all diversified and complex systems (precisely what every major college/university is) there is a leveling of outputs based on de facto competition.  In common sense terms, there may be more variation of performance within an organization than among similar organizations, when an attempt is made to sum or average overall experience.  The practical significance: with a small bit of coaching, human experts on higher education can likely identify the better or worse extremities of "high performing" and "low performing" colleges/universities.  The in-the-middle thousands may blur because their performances tend to regress to each stratum's universe mean.  Consider that in the last half century no credible college or university has been put out of business because its outputs were wholly without merit, or because its graduates could not acquire employment.  A sketch of the needed variance check follows.
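The check the paragraph above calls for can be phrased as a variance-decomposition (intraclass correlation) question.  Below is a minimal one-way ANOVA-style sketch with invented placeholder data; if the within-institution mean square dominates the between-institution mean square for a candidate factor, that factor cannot meaningfully rank whole institutions.

```python
# Sketch of the within- vs. between-institution variance check: a one-way
# variance decomposition on any candidate rating factor.  The "completion
# rate" figures below are invented placeholders for program-level data.

def variance_components(groups):
    """Return (between-group, within-group) mean squares for a factor."""
    all_vals = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_vals) / len(all_vals)
    k, n_total = len(groups), len(all_vals)

    ss_between = sum(
        len(vals) * ((sum(vals) / len(vals)) - grand_mean) ** 2
        for vals in groups.values()
    )
    ss_within = sum(
        (v - sum(vals) / len(vals)) ** 2
        for vals in groups.values()
        for v in vals
    )
    return ss_between / (k - 1), ss_within / (n_total - k)

# Hypothetical program-level completion rates inside three institutions.
factor_by_institution = {
    "U1": [0.61, 0.88, 0.42, 0.75, 0.55],
    "U2": [0.58, 0.84, 0.47, 0.71, 0.60],
    "U3": [0.64, 0.80, 0.45, 0.78, 0.52],
}

ms_between, ms_within = variance_components(factor_by_institution)
print(f"between-institution mean square: {ms_between:.4f}")
print(f"within-institution mean square:  {ms_within:.4f}")
# If the within term dominates, the factor cannot rank whole institutions.
```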

Rank Versus Supply Real Information for Choice

The commercially hyped collegiate rating schemes -- U.S. News, Forbes, Princeton, et al. -- have been widely criticized for their simplistic foundations, and for the reality that they provide minimal discrimination of a complex product.  But they, along with such counterproductive ratings as "best party school," are still allegedly used as input to a critical life decision, an American tragedy.  That prompts the leading question:  Is the Obama/Duncan strategy embodied in the proposed rankings one of the worst decisions of this administration, matching or exceeding even the core ignorance of present punitive, test-based practice in public K-12?  Would far better choices have been, for example, the long view, with strategic research to field a legitimate, comprehensive rating scheme for our institutions' multidimensional areas of performance (call it the 'value-rating' model); or a non-punitive and affirmative alternative, a 'value-choice' model, whose mission is providing comprehensive, valid, and comparable information on all public higher education institutions, letting users supply their own criteria for choice of school?

Both example approaches start with the same research roots:  A priori judgments of the factors considered central to the quality and equity of the higher education delivered, irrespective of whether those factors are presently quantified; next, the development work is executed to convert those multidimensional factors, by algorithm or by scaling techniques, into digital metrics.  At this point the approaches bifurcate.  Value-choice becomes the task of creating easily accessible and universal databases, placed in "the cloud" and readily available online, searchable via criteria pertinent to the individual collegiate wannabe, or in another possible form serving as the material for simulation to derive optimal choices for a student.  The rest of our real world is inundated with clever "apps," available for even the ubiquitous smart phone.  Publicly accessible, digital, and online, the system offers at low or no cost the structured information to personally search possible school choices.  The values or experiences sought from a candidate school remain the elections of the potential student and parents, not predetermined by big brother.  A toy sketch of the idea follows.
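As a toy sketch of the value-choice mechanism (every institution record, field name, and threshold below is invented), the core is nothing more exotic than a uniformly structured dataset that the student filters with her own criteria.

```python
# Hypothetical sketch of the "value-choice" model: a uniformly structured
# institutional dataset that the prospective student queries with her own
# criteria.  Every record, field name, and threshold here is invented.

institutions = [
    {"name": "State U", "net_cost": 14_500, "six_year_completion": 0.71,
     "programs": {"nursing", "physics"}, "region": "midwest"},
    {"name": "Tech Inst", "net_cost": 21_000, "six_year_completion": 0.83,
     "programs": {"engineering"}, "region": "midwest"},
    {"name": "City Coll", "net_cost": 9_800, "six_year_completion": 0.56,
     "programs": {"nursing", "business"}, "region": "northeast"},
]

def choose(records, *, max_net_cost, required_program, min_completion):
    """Return the institutions meeting criteria the student herself sets."""
    return [
        r["name"] for r in records
        if r["net_cost"] <= max_net_cost
        and required_program in r["programs"]
        and r["six_year_completion"] >= min_completion
    ]

# The criteria belong to the student, not to the rating agency.
print(choose(institutions, max_net_cost=16_000,
             required_program="nursing", min_completion=0.60))
# -> ['State U']
```

The design point is that the criteria and their weights belong to the student and family, not to a central rating agency.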

The second approach -- value-rating -- does carry out the intent of the Obama/Duncan vision, ordinal rating of institutions, but based on the constituent properties of collegiate value delivery noted for the first approach.  What changes, and what additional research is needed?  One model for the second approach might be structured as follows:  The starting point is a quota sample from America's colleges/universities serving as the development base, the sample reflecting meaningful categorizations of our institutions; for factors presumed causal for quality and equitable delivery by an institution, break out the programs or tracks that constitute legitimate units of analysis; use a "human expert model" of decision making to create criterion positioning of the sample organizations, for each unit of analysis, on the various factors; then test the goodness of fit between the metrics devised and expert positioning across all factors and units of analysis, mathematically determining the salience and weighting of the factors that fit expert prediction.  Lastly, the metrics proving predictive are tested on a second, comparable sample of our institutions for verification.  A compressed sketch of that loop follows.
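A compressed sketch of that development loop, with toy data and ordinary least squares standing in for whatever scaling technique the real research would select: fit candidate metrics to expert criterion ratings on a development sample, read off the factor weights, then verify the predictive fit on a held-out sample.

```python
# Toy sketch of the "value-rating" development loop: regress expert criterion
# ratings on candidate metrics for a development sample, then verify the
# fitted weights on a held-out sample.  All data are synthetic, and least
# squares merely stands in for whatever scaling method the research selects.

import numpy as np

rng = np.random.default_rng(1)

def make_sample(n):
    """Candidate metrics (columns) plus a simulated expert criterion rating."""
    metrics = rng.normal(size=(n, 3))           # e.g. access, cost, outcomes
    true_weights = np.array([0.5, 0.2, 0.3])    # unknown in real research
    expert = metrics @ true_weights + rng.normal(scale=0.1, size=n)
    return metrics, expert

dev_X, dev_y = make_sample(60)    # development sample of institutions/units
val_X, val_y = make_sample(40)    # verification (holdout) sample

# Fit factor weights on the development sample (intercept column included).
A = np.column_stack([np.ones(len(dev_y)), dev_X])
weights, *_ = np.linalg.lstsq(A, dev_y, rcond=None)
print("fitted factor weights:", np.round(weights[1:], 2))

# Verification: do those weights reproduce expert ratings on unseen cases?
pred = np.column_stack([np.ones(len(val_y)), val_X]) @ weights
r = np.corrcoef(pred, val_y)[0, 1]
print(f"holdout correlation with expert ratings: {r:.2f}")
```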

There are already out there, in the mass of college/university data banked on institutions' web sites, and made available in detail by a plethora of both public and private sector organizations, the raw data to start building either of the above approaches.  Most of our institutions are working with their own game plans, but the composite of data generated could be a starting point, for example, for building a universal higher education database serving the value-choice approach.  A tragedy of our present society is that a Bill Gates, instead of funding programs designed to beat on our public schools with testing, apparently lacked the perspicacity to pursue even his own suite of digital experiences to fund and guide the assembly of a suitable higher education national database?

Can the value-rating model actually be executed?  It is arguable that it already has been, in part: the logic employed by Tom Peters and his associates in creating the corporate effort In Search of Excellence is an early precursor to that approach; it stopped short of seeking to quantify the determinants of excellence, but the core idea was successful.  Using the power of that same Federal funding to our colleges/universities as an incentive would engage our universities in the needed research.  That is a far better use of the incentive than seeking to intimidate our institutions into change by ranking linked to punitive reductions in funding.  Lastly, you are developing metrics that are defined by the real measurement challenge, and not by what was developed for other purposes or is simply convenient.

Conclusion

Historically, toward the end of last century, one of the Presidential Commissions on Higher Education offered the White House and our higher education community very practical recommendations.  They encompassed reducing higher education costs, reforming funding of tuition and other costs of a degree, and cooperation among all of our post-secondary schools to adopt a common set of parameters making available to America's families uniform ways to assess collegiate choice.  Both our college/university leaderships, and our political system quickly rejected all three sets of well-reasoned recommendations.  Clearly, moving either of the above approaches, or anything resembling them to a productive destination would require some new mindsets, among our higher education institutions, and in Federal education leadership's sensitivity to genuine national needs over liberal dreaming.

Counterpoint: some of our colleges and universities, presumably "reading the room," have already initiated innovative changes in their collegiate instruction.  As reported in Saturday's New York Times, changes are occurring in B-schools' MBA programs -- to emulate the rapidity of change and experimentation of Silicon Valley -- and in basic collegiate science courses, moving from lecture modes to high student involvement and problem solving.  Long-valid patterns of diffusion of innovation will change higher education, even as the critically deficient Obama/Duncan rating scheme is stumbling out of the starting gate.  Perhaps merely the threat of that Federal 'Franken data' has stimulated collegiate action?  Incredibly cynical, albeit clever, if true; but if accurate, the rest of the program should be given a quick burial.

On real inspection the proposed Department of Education rating scheme, regardless of intentions, simply reeks of ignorance and a flawed understanding of complex academic organizational behavior, of advanced learning, and of the most basic principles of inquiry and social science explanation.  The scheme could, by analogy, be compared to trying to build a quantum computer using some AA batteries, a photo transistor, a couple of resistors and capacitors, and some wire scrounged from the ties used on garbage bags.  The present scheme, even if Machiavellian, as well as mirroring the mental set that any solution has to be punitive, is wholly unworthy of a Federal education function critical to our nation, and is a condemnation of the current resources managing that agency.


Epilog

The next issues of Educationredux will move into challenges and opportunities throughout US higher education that might be areas for measured change along with possible innovations.  First out of the chute will be the footers for more productive higher education experiences -- bridging the chasm between our K-12, especially 9-12 school outputs, and the incoming requirements for collegiate success -- allowing passage through collegiate work with greater learning effect, in shorter periods of time, and therefore with less investment.