Another word for it


Flow: Actor-based Concurrency with C++ [FoundationDB]

Sun, 02/15/2015 - 01:37


Topic Maps

Flow: Actor-based Concurrency with C++

From the post:

FoundationDB began with ambitious goals for both high performance per node and scalability. We knew that to achieve these goals we would face serious engineering challenges while developing the FoundationDB core. We’d need to implement efficient asynchronous communicating processes of the sort supported by Erlang
or the Async library in .NET, but we’d also need the raw speed and I/O efficiency of C++. Finally, we’d need to perform extensive simulation to engineer for reliability and fault tolerance on large clusters.

To meet these challenges, we developed several new tools, the first of which is Flow, a new programming language that brings actor-based concurrency to C++11. To add this capability, Flow introduces a number of new keywords and control-flow primitives for managing concurrency. Flow is implemented as a compiler which analyzes an asynchronous function (actor) and rewrites it as an object with many different sub-functions that use callbacks to avoid blocking (see streamlinejs for a similar concept using JavaScript). The Flow compiler’s output is normal C++11 code, which is then compiled to a binary using traditional tools. Flow also provides input to our simulation tool, Lithium, which conducts deterministic simulations of the entire system, including its physical interfaces and failure modes. In short, Flow allows efficient concurrency within C++ in a maintainable and extensible manner, achieving all three major engineering goals:

  • high performance (by compiling to native code),
  • actor-based concurrency (for high productivity development),
  • simulation support (for testing).
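The post doesn’t show Flow syntax, but the transformation it describes — a sequential-looking actor rewritten into an object whose sub-functions run as callbacks — can be sketched in miniature. The following Python toy (all names invented; Flow itself is a compiler that emits plain C++11) drives a generator one step per callback, so nothing ever blocks:

```python
# Toy illustration (NOT Flow itself) of actor-to-callback rewriting.

class Future:
    """A value that arrives later; callbacks fire when it is set."""
    def __init__(self):
        self._value = None
        self._callbacks = []
        self.ready = False

    def set(self, value):
        self._value = value
        self.ready = True
        for cb in self._callbacks:
            cb(value)

    def on_ready(self, cb):
        if self.ready:
            cb(self._value)
        else:
            self._callbacks.append(cb)

def actor(gen_fn):
    """Drive a generator that yields Futures, resuming on each callback."""
    def start(*args):
        result = Future()
        gen = gen_fn(*args)
        def step(sent):
            try:
                fut = gen.send(sent)      # run until the next 'wait'
            except StopIteration as stop: # actor returned a value
                result.set(stop.value)
                return
            fut.on_ready(step)            # resume when the Future is set
        step(None)
        return result
    return start

@actor
def add_when_ready(a_future, b_future):
    a = yield a_future   # reads sequentially, runs as callbacks
    b = yield b_future
    return a + b

a, b = Future(), Future()
total = add_when_ready(a, b)
a.set(2)
b.set(3)
```

The difference is that Flow performs this rewriting statically at compile time, producing ordinary C++11 objects rather than relying on runtime generator support.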

Flow Availability

Flow is not currently available outside of FoundationDB, but we’d like to open-source it in the future. If you’d like to stay in the loop with our progress subscribe below.

Are you going to be ready when Flow is released separately from FoundationDB?

Streets of Paris Colored by Orientation

Sun, 02/15/2015 - 01:12



Streets of Paris Colored by Orientation by Mathieu Rajerison.

From the post:

Recently, I read an article by datapointed which presented maps of streets of different cities colored by orientation.

The author gave some details about the method, which I tried to reproduce. In this post, I present the different steps from the calculation in my favorite spatial R ToolBox to the rendering in QGIS using a specific blending mode.
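The post’s own pipeline is R plus QGIS; purely as a sketch of the underlying idea, here is a Python version with invented sample data: compute each street segment’s bearing, fold it modulo 90° (so a grid of mutually perpendicular streets reads as one color), and bin it into a small palette:

```python
import math

def segment_orientation(x1, y1, x2, y2):
    """Bearing of a segment, folded into [0, 90) degrees."""
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180
    return angle % 90

# One color per 22.5-degree bin (palette choice is arbitrary here)
PALETTE = ["red", "orange", "green", "blue"]

def color_for(x1, y1, x2, y2):
    return PALETTE[int(segment_orientation(x1, y1, x2, y2) // 22.5)]

print(color_for(0, 0, 1, 0))  # east-west street, 0 degrees
print(color_for(0, 0, 1, 2))  # steep diagonal, about 63 degrees
```

Folding by 90° rather than 180° is the design choice that makes grid neighborhoods show up as solid blocks of a single hue in maps like these.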

An opportunity to practice R and work with maps. More enjoyable than sifting data to find less corrupt politicians.

I first saw this in a tweet by Caroline Moussy.

Mercury [March 5, 2015, Washington, DC]

Sun, 02/15/2015 - 00:47



Mercury Registration Deadline: February 17, 2015.

From the post:

The Intelligence Advanced Research Projects Activity (IARPA) will host a Proposers’ Day Conference for the Mercury Program on March 5, in anticipation of the release of a new solicitation in support of the program. The Conference will be held from 8:30 AM to 5:00 PM EST in the Washington, DC metropolitan area. The purpose of the conference will be to provide introductory information on Mercury and the research problems that the program aims to address, to respond to questions from potential proposers, and to provide a forum for potential proposers to present their capabilities and identify potential team partners.

Program Description and Goals

Past research has found that publicly available data can be used to accurately forecast events such as political crises and disease outbreaks. However, in many cases, relevant data are not available, have significant lag times, or lack accuracy. Little research has examined whether data from foreign Signals Intelligence (SIGINT) can be used to improve forecasting accuracy in these cases.

The Mercury Program seeks to develop methods for continuous, automated analysis of SIGINT in order to anticipate and/or detect political crises, disease outbreaks, terrorist activity, and military actions. Anticipated innovations include: development of empirically driven sociological models for population-level behavior change in anticipation of, and response to, these events; processing and analysis of streaming data that represent those population behavior changes; development of data extraction techniques that focus on volume, rather than depth, by identifying shallow features of streaming SIGINT data that correlate with events; and development of models to generate probabilistic forecasts of future events. Successful proposers will combine cutting-edge research with the ability to develop robust forecasting capabilities from SIGINT data.

Mercury will not fund research on U.S. events, or on the identification or movement of specific individuals, and will only leverage existing foreign SIGINT data for research purposes.

The Mercury Program will consist of both unclassified and classified research activities and expects to draw upon the strengths of academia and industry through collaborative teaming. It is anticipated that teams will be multidisciplinary, and might include social scientists, mathematicians, statisticians, computer scientists, content extraction experts, information theorists, and SIGINT subject matter experts with applied experience in the U.S. SIGINT System.

Attendees must register no later than 6:00 pm EST, February 27, 2015. Directions to the conference facility and other materials will be provided upon registration. No walk-in registrations will be allowed.

I might be interested if you can hide me under a third or fourth level sub-contractor.

Seriously, it isn’t that I despair of the legitimate missions of intelligence agencies, but I do despise waste on approaches known not to work. Government funding, even unlimited funding, isn’t going to magically confer the correct semantics on data or enable analysts to meaningfully share their work products across domains.

You would think that going on fourteen (14) years post-9/11 without being one step closer to preventing a similar event would be a “wake-up” call to someone. If not to the U.S. intelligence community, then perhaps to intelligence communities that tire of aping the U.S. community with no better results.

OpenGov Voices: Bringing transparency to earmarks buried in the budget

Sun, 02/15/2015 - 00:29



OpenGov Voices: Bringing transparency to earmarks buried in the budget by Matthew Heston, Madian Khabsa, Vrushank Vora, Ellery Wulczyn and Joe Walsh.

From the post:

Last week, President Obama kicked off the fiscal year 2016 budget cycle by unveiling his $3.99 trillion budget proposal. Congress has the next eight months to write the final version, leaving plenty of time for individual senators and representatives, state and local governments, corporate lobbyists, bureaucrats, citizens groups, think tanks and other political groups to prod and cajole for changes. The final bill will differ from Obama’s draft in major and minor ways, and it won’t always be clear how those changes came about. Congress will reveal many of its budget decisions after voting on the budget, if at all.

We spent this past summer with the Data Science for Social Good program trying to bring transparency to this process. We focused on earmarks – budget allocations to specific people, places or projects – because they are “the best known, most notorious, and most misunderstood aspect of the congressional budgetary process” — yet remain tedious and time-consuming to find. Our goal: to train computers to extract all the earmarks from the hundreds of pages of mind-numbing legalese and numbers found in each budget.

Watchdog groups such as Citizens Against Government Waste and Taxpayers for Common Sense have used armies of human readers to sift through budget documents, looking for earmarks. The White House Office of Management and Budget enlisted help from every federal department and agency, and the process still took three months. In comparison, our software is free and transparent and generates similar results in only 15 minutes. We used the software to construct the first publicly available database of earmarks that covers every year back to 1995.

Despite our success, we barely scratched the surface of the budget. Not only do earmarks comprise a small portion of federal spending but senators and representatives who want to hide the money they budget for friends and allies have several ways to do it:

I was checking the Sunlight Foundation Blog for any updated information on the soon-to-be-released indexes of federal data holdings when I encountered this jewel on earmarks.

Important to read/support because:

  1. By dramatically reducing the human time investment needed to find earmarks, it frees up that time to be spent gathering deeper information about each earmark.
  2. It represents a major step forward in the ability to discover relationships between players in the data (what the NSA wants to do but with a rationally chosen data set).
  3. It will educate you on earmarks and their hiding places.
  4. It is an inspirational example of how darkness can be replaced with transparency, some of it anyway.
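The team’s actual extraction software isn’t shown in the post. Purely as an illustration of the kind of pattern-matching such a tool starts from, here is a toy sketch that pulls dollar amounts tied to named recipients out of budget-style text (the regex and the sample sentence are mine; real budget legalese needs far more than a regex):

```python
import re

# Match "$AMOUNT [million|billion] for/to [the] Recipient Name"
EARMARK = re.compile(
    r"\$(?P<amount>[\d,]+(?:\.\d+)?)\s*(?P<unit>million|billion)?"
    r"\s+(?:for|to)\s+(?:the\s+)?(?P<recipient>[A-Z][\w .,'-]+?)(?=[;.]|$)"
)

def find_earmarks(text):
    """Return (dollar_amount, recipient) pairs found in text."""
    out = []
    for m in EARMARK.finditer(text):
        scale = {"million": 1e6, "billion": 1e9}.get(m.group("unit"), 1)
        amount = float(m.group("amount").replace(",", "")) * scale
        out.append((amount, m.group("recipient").strip()))
    return out

sample = ("Provided, that $2.5 million for the Example County Levee "
          "District; and $750,000 to Anytown Transit Authority.")
print(find_earmarks(sample))
```

The hard part, as the post makes clear, is everything this sketch ignores: tables, cross-references, amounts split across provisions, and earmarks phrased to avoid looking like earmarks.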

Will transparency reduce earmarks? I rather doubt it because a sense of shame doesn’t seem to motivate elected and appointed officials.

What transparency can do is create a more level playing field for those who want to buy government access and benefits.

For example, if I knew what it cost to have the following exemption in the FOIA:

Exemption 9: Geological information on wells.

it might be possible to raise enough funds to purchase the deletion of:

Exemption 5: Information that concerns communications within or between agencies which are protected by legal privileges, that include but are not limited to:

3. Deliberative Process Privilege

Which is where some staffers hide their negotiations with former staffers as they prepare to exit the government.

I don’t know that matching what Big Oil paid for the geological information on wells exemption would be enough, but it would set a baseline for what it takes to start the conversation.

I say “Big Oil paid…” assuming that most of us don’t equate matters of national security with geological information. Do you have another explanation for such an offbeat provision?

If government is (and I think it is) for sale, then let’s open up the bidding process.

A big win for open government: Sunlight gets U.S. to…

Sat, 02/14/2015 - 23:58



A big win for open government: Sunlight gets U.S. to release indexes of federal data by Matthew Rumsey and Sean Vitka and John Wonderlich.

From the post:

For the first time, the United States government has agreed to release what we believe to be the largest index of government data in the world.

On Friday, the Sunlight Foundation received a letter from the Office of Management and Budget (OMB) outlining how they plan to comply with our FOIA request from December 2013 for agency Enterprise Data Inventories. EDIs are comprehensive lists of a federal agency’s information holdings, providing an unprecedented view into data held internally across the government. Our FOIA request was submitted 14 months ago.

These lists of the government’s data were not public, however, until now. More than a year after Sunlight’s FOIA request and with a lawsuit initiated by Sunlight about to be filed, we’re finally going to see what data the government holds.

Since 2013, federal agencies have been required to construct a list of all of their major data sets, subject only to a few exceptions detailed in President Obama’s executive order as well as some information exempted from disclosure under the FOIA.

Many kudos to the Sunlight Foundation!

As to using the word “win,” do we need to wait and see which Enterprise Data Inventories are in fact produced?

I say that because the executive order of President Obama cited in the post provides these exemptions from disclosure:

4(d) Nothing in this order shall compel or authorize the disclosure of privileged information, law enforcement information, national security information, personal information, or information the disclosure of which is prohibited by law.

Will that be taken as an excuse to not list the data collections at all?

Or, will the NSA say:

one (1) collection of telephone metadata, timeSpan: 4(d) exempt, size: 4(d) exempt, metadataStructure: 4(d) exempt, source: 4(d) exempt

Do they mean internal NSA phone logs? Do they mean some other source?

Or will they simply not list telephone metadata at all?

What’s exempt under the FOIA?

Not all records can be released under the FOIA.  Congress established certain categories of information that are not required to be released in response to a FOIA request because release would be harmful to governmental or private interests.   These categories are called "exemptions" from disclosures.  Still, even if an exemption applies, agencies may use their discretion to release information when there is no foreseeable harm in doing so and disclosure is not otherwise prohibited by law.  There are nine categories of exempt information and each is described below.  

Exemption 1: Information that is classified to protect national security.  The material must be properly classified under an Executive Order.

Exemption 2: Information related solely to the internal personnel rules and practices of an agency.

Exemption 3: Information that is prohibited from disclosure by another federal law. Additional resources on the use of Exemption 3 can be found on the Department of Justice FOIA Resources page.

Exemption 4: Information that concerns business trade secrets or other confidential commercial or financial information.

Exemption 5: Information that concerns communications within or between agencies which are protected by legal privileges, that include but are not limited to:

  1. Attorney-Work Product Privilege
  2. Attorney-Client Privilege
  3. Deliberative Process Privilege
  4. Presidential Communications Privilege

Exemption 6: Information that, if disclosed, would invade another individual’s personal privacy.

Exemption 7: Information compiled for law enforcement purposes if one of the following harms would occur.  Law enforcement information is exempt if it: 

  • 7(A). Could reasonably be expected to interfere with enforcement proceedings
  • 7(B). Would deprive a person of a right to a fair trial or an impartial adjudication
  • 7(C). Could reasonably be expected to constitute an unwarranted invasion of personal privacy
  • 7(D). Could reasonably be expected to disclose the identity of a confidential source
  • 7(E). Would disclose techniques and procedures for law enforcement investigations or prosecutions
  • 7(F). Could reasonably be expected to endanger the life or physical safety of any individual

Exemption 8: Information that concerns the supervision of financial institutions.

Exemption 9: Geological information on wells.

And the exclusions:

Congress has provided special protection in the FOIA for three narrow categories of law enforcement and national security records. The provisions protecting those records are known as “exclusions.” The first exclusion protects the existence of an ongoing criminal law enforcement investigation when the subject of the investigation is unaware that it is pending and disclosure could reasonably be expected to interfere with enforcement proceedings. The second exclusion is limited to criminal law enforcement agencies and protects the existence of informant records when the informant’s status has not been officially confirmed. The third exclusion is limited to the Federal Bureau of Investigation and protects the existence of foreign intelligence or counterintelligence, or international terrorism records when the existence of such records is classified. Records falling within an exclusion are not subject to the requirements of the FOIA. So, when an office or agency responds to your request, it will limit its response to those records that are subject to the FOIA.

You can spot, as well as I can, the truck-sized holes that may prevent disclosure.

One analytic challenge upon the release of the Enterprise Data Inventories will be to determine what is present and what is missing but should be present. Another will be to assist the Sunlight Foundation in its pursuit of additional FOIAs to obtain data listed but not available. Perhaps I should call this an important victory, although of a battle and not of the long-term war for government transparency.


American FactFinder

Sat, 02/14/2015 - 21:38



American FactFinder

From the webpage:

American FactFinder provides access to data about the United States, Puerto Rico and the Island Areas. The data in American FactFinder come from several censuses and surveys. For more information see Using FactFinder and What We Provide.

As I was writing this post I returned to CensusReporter (2013), which reported on an effort to make U.S. census data easier to use. Essentially a common toolkit.

At that time CensusReporter was in “beta,” but it has long since passed that stage! Whether you prefer American FactFinder or CensusReporter will depend on you and your requirements.

I can say that CensusReporter is working on A tool to aggregate American Community Survey data to non-census geographies. That could prove to be quite useful.


Thank Snowden: Internet Industry Now Considers The Intelligence Community An Adversary, Not A Partner

Sat, 02/14/2015 - 19:31



Thank Snowden: Internet Industry Now Considers The Intelligence Community An Adversary, Not A Partner by Mike Masnick

From the post:

We already wrote about the information sharing efforts coming out of the White House cybersecurity summit at Stanford today. That’s supposedly the focus of the event. However, there’s a much bigger issue happening as well: and it’s the growing distrust between the tech industry and the intelligence community. As Bloomberg notes, the CEOs of Google, Yahoo and Facebook were all invited to join President Obama at the summit and all three declined. Apple’s CEO Tim Cook will be there, but he appears to be delivering a message to the intelligence and law enforcement communities, if they think they’re going to get him to drop the plan to encrypt iOS devices by default:

In an interview last month, Timothy D. Cook, Apple’s chief executive, said the N.S.A. “would have to cart us out in a box” before the company would provide the government a back door to its products. Apple recently began encrypting phones and tablets using a scheme that would force the government to go directly to the user for their information. And intelligence agencies are bracing for another wave of encryption.

Disclosure: I have been guilty of what I am about to criticize Mike Masnick about and will almost certainly be guilty of it in the future. That, however, does not make it right.

What would you say is being assumed in Mike’s title?

Guesses anyone?

What if it read: U.S. Internet Industry Now Considers The U.S. Intelligence Community An Adversary, Not A Partner?

Does that help?

The trivial point is that the “Internet Industry” isn’t limited to the U.S. and Mike’s readership isn’t either.

More disturbing though is that the “U.S. (meant here descriptively) Internet Industry” at one point did consider the “U.S. (again descriptively) Intelligence Community” a partner.

That being the case and seeing how Mike duplicates that assumption in his title, how should countries besides the U.S. view the reliability (in terms of government access) of U.S. produced software?

That’s a simple enough question.

What is your answer?

The assumption of partnership between the “U.S. Internet Industry” and the “U.S. Intelligence Community” would have me rushing to back an alternative to China’s recent proposal that source code be delivered to the government (in that case, China).

Rather than every country having different import requirements for software sales, why not require the public posting of commercial software source for software sales anywhere?

Posting of source code doesn’t lessen your rights to the code (see copyright statutes) and it makes detection of software piracy trivially easy since all commercial software has to post its source code.

Oh, some teenager might compile a copy but do you really think major corporations in any country are going to take that sort of risk? It just makes no sense.

As far as the “U.S. Intelligence Community” is concerned, remember: “The treacherous are ever distrustful…” The ill-intent of the world they see is a reflection of their own malice towards others. Or, after years of systematic abuse, the smoldering anger of the abused.

In Defense of the Good Old-Fashioned Map

Sat, 02/14/2015 - 18:40



In Defense of the Good Old-Fashioned Map – Sometimes, a piece of folded paper takes you to places the GPS can’t by Jason H. Harper.

A great testimonial to hard copy maps in addition to being a great read!

From the post:

But just like reading an actual, bound book or magazine versus an iPad or Kindle, you consume a real map differently. It’s easier to orient yourself on a big spread of paper, and your eye is drawn to roads and routes and green spaces you’d never notice on a small screen. A map invites time and care and observance of the details. It encourages the kind of exploration that happens in real life, when you’re out on the road, instead of the turn-by-turn rigidity of a digital device.

You can scroll or zoom with a digital map or digital representation of a topic map, but that isn’t quite the same as using a large, hard copy representation. Digital scrolling and zooming is like exploring a large scale world map through a toilet paper tube. It’s doable but I would argue it is a very different experience from a physical large scale world map.

Unless you are at a high-end visualization center, or until we have walls as high-resolution displays, you may want to think about production of topic maps as hard copy maps for some applications. While having maps printed isn’t cheap, it pales next to the intellectual effort that goes into constructing a useful topic map.

A physical representation of a topic map would have all the other advantages of a hard copy map. It would survive and be accessible without electrical power, it could be manually annotated, it could shared with others in the absence of computers, it could be compared to observations and/or resources, in fact it could be rather handy.

I don’t have a specific instance in mind but raise the point to keep in mind the range of topic map deliverables.

Principal Component Analysis – Explained Visually [Examples up to 17D]

Sat, 02/14/2015 - 16:37



Principal Component Analysis – Explained Visually by Victor Powell.

From the website:

Principal component analysis (PCA) is a technique used to emphasize variation and bring out strong patterns in a dataset. It’s often used to make data easy to explore and visualize.

Another stunning visualization (2D, 3D and 17D, yes, not a typo, 17D) from Explained Visually.
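For readers who want the arithmetic behind the visualization, here is a small self-contained sketch (my own example data, not from the linked site) of PCA in the 2-D case: center the data, eigen-decompose the covariance matrix, and read off the direction of maximum variance. The 2x2 case has a closed form:

```python
import math

points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
          (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

n = len(points)
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n

# Sample covariance matrix [[sxx, sxy], [sxy, syy]], divisor n - 1
sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

# Eigenvalues from the characteristic polynomial of a 2x2 matrix
tr, det = sxx + syy, sxx * syy - sxy ** 2
disc = math.sqrt(tr ** 2 / 4 - det)
lam1, lam2 = tr / 2 + disc, tr / 2 - disc  # lam1 >= lam2

# First principal component = unit eigenvector for the larger eigenvalue
vx, vy = sxy, lam1 - sxx
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)

print("variance along components:", round(lam1, 4), round(lam2, 4))
print("direction of maximum variance:", pc1)
```

In higher dimensions (the site’s 17-D example) the recipe is identical; you just hand the covariance matrix to a numerical eigensolver instead of using the quadratic formula.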

Probably not the top item in your mind on Valentine’s Day but you should bookmark it and return when you have more time.

I first saw this in a tweet by Mike Loukides.

An R Client for the Internet Archive API

Sat, 02/14/2015 - 01:19



An R Client for the Internet Archive API by Lincoln Mullen.

From the webpage:

In support of some of my research projects, I created a simple R package to access the Internet Archive’s API. The package is intended to search for items, to retrieve their metadata in a usable form, and to download the files associated with the items. The package, called internetarchive, is available on GitHub. The README and the vignette have a full explanation, but here is a brief overview.
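The package itself is R, but the Archive also exposes a public advancedsearch endpoint that clients like this wrap. As a hedged sketch (the query string and field choices below are just my example), a few lines of Python can build the same kind of search:

```python
from urllib.parse import urlencode

def ia_search_url(query, fields=("identifier", "title"), rows=10):
    """Build an Internet Archive search URL that returns JSON metadata."""
    params = [("q", query), ("rows", rows), ("output", "json")]
    params += [("fl[]", f) for f in fields]  # fields to return per item
    return "https://archive.org/advancedsearch.php?" + urlencode(params)

url = ia_search_url('collection:gutenberg AND title:"frankenstein"')
print(url)

# Fetching is then standard-library work, e.g.:
#   import json, urllib.request
#   docs = json.load(urllib.request.urlopen(url))["response"]["docs"]
```

The point of a package like internetarchive is everything after this step: turning the returned metadata into a usable form and downloading the files attached to each item.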

This is cool!

And a great way to contrast NSA data collection with useful data collection.

If you were the NSA, you would suck down all the new Internet Archive content every day. Then you would “explore” that plus lots of other content for “relationships,” which abound in any data set that large.

If you are Lincoln Mullen or someone empowered by his work, you search for items and incrementally build a set of items with context and additional information you add to that set.

If you were paying the bill, which of those approaches seems the most likely to produce useful results?

Information/data/text mining doesn’t change in nature due to size or content or the purpose of the searching or who’s paying the bill. The goal is (or should be) useful results for some purpose X.

XPath/XQuery/FO/XDM 3.1 Comments Filed!

Sat, 02/14/2015 - 01:00



I did manage to file seventeen (17) comments today on the XPath/XQuery/FO/XDM 3.1 drafts!

I haven’t mastered Bugzilla well enough to create an HTML list of them to paste in here, but no doubt will do so over the weekend.

Remember these are NOT “bugs” until they are accepted by the working group as “bugs.” Think of them as being suggestions on my part where the drafts were unclear or could be made clearer in my view.

Did you remember to post comments?

I will try to get a couple of short things posted tonight but getting the comments in was my priority today.

Solr 5.0 Will See Another RC – But Docs Are Available

Fri, 02/13/2015 - 17:08



I saw a tweet from Anshum Gupta today saying:

Though the vote passed, seems like there’s need for another RC for #Apache #Lucene / #Solr 5.0. Hopefully we’d be third time lucky.

To brighten your weekend prospects, the Apache Solr Reference Guide for Solr 5.0 is available.

With another Solr RC on the horizon, now would be a good time to spend some time with the reference guide, both to learn the new features and to smooth out any infelicities in the documentation.


storm-bolt-of-death

Fri, 02/13/2015 - 16:29




From the webpage:

An Apache Storm topology that will, by design, trigger failures at run-time.

The purpose of this bolt-of-death topology is to help testing Storm cluster stability. It was originally created to identify the issues surrounding the Storm defects described at STORM-329 and STORM-404.

This reminds me of PANIC! UNIX System Crash Dump Analysis Handbook by Chris Drake and Kimberley Brown. Has it really been twenty (20) years since that came out?

If you need something a bit more up to date, Linux Kernel Crash Book: Everything you need to know by Igor Ljubuncic (aka Dedoimedo) is available as both free and paid ($) PDF files (to support the website).

Everyone needs a hobby, perhaps analyzing clusters and core dumps will be yours!


I first saw storm-bolt-of-death in a tweet by Michael G. Noll.

Akin’s Laws of Spacecraft Design*

Fri, 02/13/2015 - 01:16



Akin’s Laws of Spacecraft Design* by David Akin.

I started to do some slight editing to make these laws of “software” design, but if you can’t make that transposition for yourself, my doing it for you isn’t going to help.

From the site of origin (unchanged):

1. Engineering is done with numbers. Analysis without numbers is only an opinion.

2. To design a spacecraft right takes an infinite amount of effort. This is why it’s a good idea to design them to operate when some things are wrong.

3. Design is an iterative process. The necessary number of iterations is one more than the number you have currently done. This is true at any point in time.

4. Your best design efforts will inevitably wind up being useless in the final design. Learn to live with the disappointment.

5. (Miller’s Law) Three points determine a curve.

6. (Mar’s Law) Everything is linear if plotted log-log with a fat magic marker.

7. At the start of any design effort, the person who most wants to be team leader is least likely to be capable of it.

8. In nature, the optimum is almost always in the middle somewhere. Distrust assertions that the optimum is at an extreme point.

9. Not having all the information you need is never a satisfactory excuse for not starting the analysis.

10. When in doubt, estimate. In an emergency, guess. But be sure to go back and clean up the mess when the real numbers come along.

11. Sometimes, the fastest way to get to the end is to throw everything out and start over.

12. There is never a single right solution. There are always multiple wrong ones, though.

13. Design is based on requirements. There’s no justification for designing something one bit "better" than the requirements dictate.

14. (Edison’s Law) "Better" is the enemy of "good".

15. (Shea’s Law) The ability to improve a design occurs primarily at the interfaces. This is also the prime location for screwing it up.

16. The previous people who did a similar analysis did not have a direct pipeline to the wisdom of the ages. There is therefore no reason to believe their analysis over yours. There is especially no reason to present their analysis as yours.

17. The fact that an analysis appears in print has no relationship to the likelihood of its being correct.

18. Past experience is excellent for providing a reality check. Too much reality can doom an otherwise worthwhile design, though.

19. The odds are greatly against you being immensely smarter than everyone else in the field. If your analysis says your terminal velocity is twice the speed of light, you may have invented warp drive, but the chances are a lot better that you’ve screwed up.

20. A bad design with a good presentation is doomed eventually. A good design with a bad presentation is doomed immediately.

21. (Larrabee’s Law) Half of everything you hear in a classroom is crap. Education is figuring out which half is which.

22. When in doubt, document. (Documentation requirements will reach a maximum shortly after the termination of a program.)

23. The schedule you develop will seem like a complete work of fiction up until the time your customer fires you for not meeting it.

24. It’s called a "Work Breakdown Structure" because the Work remaining will grow until you have a Breakdown, unless you enforce some Structure on it.

25. (Bowden’s Law) Following a testing failure, it’s always possible to refine the analysis to show that you really had negative margins all along.

26. (Montemerlo’s Law) Don’t do nuthin’ dumb.

27. (Varsi’s Law) Schedules only move in one direction.

28. (Ranger’s Law) There ain’t no such thing as a free launch.

29. (von Tiesenhausen’s Law of Program Management) To get an accurate estimate of final program requirements, multiply the initial time estimates by pi, and slide the decimal point on the cost estimates one place to the right.

30. (von Tiesenhausen’s Law of Engineering Design) If you want to have a maximum effect on the design of a new engineering system, learn to draw. Engineers always wind up designing the vehicle to look like the initial artist’s concept.

31. (Mo’s Law of Evolutionary Development) You can’t get to the moon by climbing successively taller trees.

32. (Atkin’s Law of Demonstrations) When the hardware is working perfectly, the really important visitors don’t show up.

33. (Patton’s Law of Program Planning) A good plan violently executed now is better than a perfect plan next week.

34. (Roosevelt’s Law of Task Planning) Do what you can, where you are, with what you have.

35. (de Saint-Exupery’s Law of Design) A designer knows that he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.

36. Any run-of-the-mill engineer can design something which is elegant. A good engineer designs systems to be efficient. A great engineer designs them to be effective.

37. (Henshaw’s Law) One key to success in a mission is establishing clear lines of blame.

38. Capabilities drive requirements, regardless of what the systems engineering textbooks say.

39. Any exploration program which "just happens" to include a new launch vehicle is, de facto, a launch vehicle program.

39. (alternate formulation) The three keys to keeping a new manned space program affordable and on schedule:
       1)  No new launch vehicles.
       2)  No new launch vehicles.
       3)  Whatever you do, don’t develop any new launch vehicles.

40. (McBryan’s Law) You can’t make it better until you make it work.

41. Space is a completely unforgiving environment. If you screw up the engineering, somebody dies (and there’s no partial credit because most of the analysis was right…)

I left the original list as promised, but for software projects I would re-cast #1 to read:

1. Software Engineering is based on user feedback. Analysis without user feedback is fantasy (yours).


I first saw this in a tweet by Neal Richter.

Digital Cartography [87]

Fri, 02/13/2015 - 00:44


Topic Maps

Digital Cartography [87] by Tiago Veloso.

Tiago has collected twenty-two (22) interactive maps that cover everything from “Why Measles May Just Be Getting Started | Bloomberg Visual Data” and “A History of New York City Basketball | NBA” (includes early stars as well) to “Map of 73 Years of Lynchings | The New York Times” and “House Vote 58 – Repeals Affordable Care Act | The New York Times.”

Sad to have come so far and yet not so far. Rather than a mob we have Congress, special interest groups and lobbyists. Rather than lynchings, everyone outside of the top 5% or so becomes poorer, less healthy, more stressed and more disposable. But we have a “free market.” Shouting that at Golgotha would not have been much comfort.

BHO – British History Online

Fri, 02/13/2015 - 00:15


Topic Maps

BHO – British History Online

The “news” from 8 December 2014 (that I missed) reports:

British History Online (BHO) is pleased to launch version 5.0 of its website. Work on the website redevelopment began in January 2014 and involved a total rebuild of the BHO database and a complete redesign of the site. We hope our readers will find the new site easier to use than ever before. New features include:

  • A new search interface that allows you to narrow your search results by place, period, source type or subject.
  • A new catalogue interface that allows you to see our entire catalogue at a glance, or to browse by place, period, source type or subject.
  • Three subject guides on local history, parliamentary history and urban history. We are hoping to add more subject guides throughout the year. If you would like to contribute, contact us.
  • Guidelines on using BHO, which include searching and browsing help, copyright and citation information, and a list of external resources that we hope will be useful to readers.
  • A new about page that includes information about our team, past and present, as well as a history of where we have come from and where we want to go next.
  • A new subscription interface (at last!) which includes three new levels of subscription in addition to the usual premium content subscription: gold subscription, which includes access to page scans, and five- and ten-year long-term BHO subscriptions.
  • Increased functionality in the maps interface, which is now fully zoomable and can even go full screen. We have also replaced the old map scans with high-quality versions.
  • We also updated the site with a fresh, new look! We aimed for easy-to-read text, clear navigation, clean design and bright new images.

​Version 5.0 has been a labour of love for the entire BHO team, but we have to give special thanks to Martin Steer, our tireless website manager who rebuilt the site from the ground up.

For over a decade, you have turned to BHO for reliable and accessible sources for the history of Britain and Ireland. We started off with 29 publications in 2003 and here is where we are now:

  • 1.2 million page views per month
  • 365,000 sessions per month
  • 1,241 publications
  • 108,227 text files
  • 184,355 images
  • 10,380 maps​

​We are very grateful to our users who make this kind of development possible. Your support allows BHO to always be growing and improving. 2014 has been a busy year for BHO and 2015 promises to be just as busy. Version 5.0 was a complete rebuild of BHO. We stripped the site down and began rebuilding from scratch. The goal of the new site is to make it as easy as possible for you to find materials relevant to your research. The new site was designed to be able to grow and expand easily, while always preserving the most important features of BHO. Read about our plans for 2015 and beyond.

We’d love to hear your feedback on our new site! If you want to stay up-to-date on what we are doing at BHO, follow us on Twitter.

Subscriptions are required for approximately 20% of the content, which enables the BHO to offer the other 80% for free.

A resource such as the BHO is a joyful reminder that not all projects sanctioned by government and its co-conspirators are venal and ill-intended.

For example, can you imagine a secondary school research paper on the Great Fire of 1666 that includes observations based on Leake’s Survey of the City After the Great Fire of 1666 Engraved By W. Hollar, 1667? With additional references from BHO materials?

I would have struck a Faustian bargain in high school had such materials been available!

That is just one treasure among many.

Teachers of English, history, humanities, etc., take note!

I first saw this in a tweet by Institute of Historical Research, U. of London.

Pot (U.S.) Calls Kettle (China) Black [Backdoors/Keys]

Thu, 02/12/2015 - 20:32


Topic Maps

Swati Khandelwal, in China Demands Tech Companies to give them Backdoor and Encryption Keys, misses a delicious irony when she writes:

In May 2014, Chinese government announced that it will roll out a new set of regulations for IT hardware and software being sold to key industries in their country. China have repeatedly blamed U.S. products and criticize that U.S. products are itself threat to national security, as they may also contain NSA backdoors, among other things.

The New York Times article that she quotes, New Rules in China Upset Western Tech Companies by Paul Mozur, points out that:

The United States has made it virtually impossible for Huawei, a major Chinese maker of computer servers and cellphones, to sell its products in the United States, arguing that its equipment could have “back doors” for the Chinese government.

Which is more amazing?

  • The U.S. has secretly had and wants to continue to have “backdoors” into software for surveillance purposes and objects to China mandating the existence of such “backdoors” openly. Or,
  • It took the Snowden revelations for the Chinese government to realize they used binary software from the U.S. at their peril?

I’m really hard pressed to choose between the two. Most of us have assumed for years (decades?) that binary software from any source was a security risk. Or as Mr. Weasley says to Ginny in Harry Potter and the Chamber of Secrets:

Never trust anything that can think for itself if you can’t see where it keeps its brain. (emphasis added)

Despite my doubt about artificial intelligence, software does perform actions without its users’ knowledge or permission, and binary code makes it impossible for a user to discover those actions. What if an ftp client, upon successful authentication, uploads the same file to two separate locations? One chosen by the user and another in the background? The user sees only the visible upload and has no easy way to detect the additional one. On *nix systems it would be easy to detect if the user knew what to look for, but the vast majority of handlers of secure data aren’t on *nix systems.
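To make the hidden-upload scenario concrete, here is a minimal Python sketch of the kind of check a suspicious *nix user might run. Everything here is hypothetical and illustrative: it counts established outbound connections attributed to a process, as a tool like `netstat -tpn` might report them, and flags the process when there are more live connections than the user’s visible transfers explain.

```python
# Hypothetical sketch: flag a process holding more outbound connections
# than its visible transfers would explain. The input is text in the
# style of `netstat -tpn` output; real auditing would parse live data.
def suspicious_connections(netstat_lines, process, expected):
    """Return (count, flagged): the number of ESTABLISHED connections
    attributed to `process`, and whether it exceeds `expected`."""
    count = sum(
        1 for line in netstat_lines
        if process in line and "ESTABLISHED" in line
    )
    return count, count > expected

# Illustrative sample: the user started one upload, but the ftp
# process holds two live connections.
sample = [
    "tcp 0 0 10.0.0.5:52100 203.0.113.9:21  ESTABLISHED 4242/ftp",
    "tcp 0 0 10.0.0.5:52101 198.51.100.7:21 ESTABLISHED 4242/ftp",
    "tcp 0 0 10.0.0.5:443   192.0.2.10:443  ESTABLISHED 5151/firefox",
]

count, flagged = suspicious_connections(sample, "ftp", expected=1)
print(count, flagged)  # prints: 2 True
```

The point is only that the evidence is visible at the system level if you know to look; most users handling sensitive data never will.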

The bottom line on binary files is: you can’t see where it keeps its brain.

At least China, reportedly (no source pointed to the new regulations or other documents), is going to require “backdoors” plus source code. Verifying a vendor-installed “backdoor” should not be difficult, but knowing whether there are other “backdoors” requires the source code. So +1 to China for realizing that without source code, conforming software may have one (1) or more “backdoors.”

Swati Khandelwal goes on to quote a communication (no link for the source) from the U.S. Chamber of Commerce and others:

An overly broad, opaque, discriminatory approach to cybersecurity policy that restricts global internet and ICT products and services would ultimately isolate Chinese ICT firms from the global marketplace and weaken cybersecurity, thereby harming China’s economic growth and development and restricting customer choice

Sorry, that went by a little quickly, let’s try that again (repeat):

An overly broad, opaque, discriminatory approach to cybersecurity policy that restricts global internet and ICT products and services would ultimately isolate Chinese ICT firms from the global marketplace and weaken cybersecurity, thereby harming China’s economic growth and development and restricting customer choice

Even after the third or fourth reading, the U.S. Chamber of Commerce position reads like gibberish.

How requiring “backdoors” and source code is “discriminatory” isn’t clear. Vendors can sell their software with a Chinese “backdoor” built in worldwide. Just as they have done with software with United States “backdoors.”

I suppose there is some additional burden on vendors who have U.S. “backdoors” but not ones for China. But there is some cost to entering any market.

There is a solution that avoids “backdoors” for all, enables better enforcement of intellectual property rights, and results in a better global Internet and ICT products and services market.

The name of that solution is: Public Open Source.

Think about it for a minute. Public open source does not mean that you have a license to compile and run the code. It certainly doesn’t mean that you can sell the code. It does mean you can read the source code, and, for example, create other products that work with that source code.

If a country were to require posting of source code for all products sold in that country, then detection of software piracy would be nearly trivial. The source code of all software products would be posted for public searching and analysis. Vendors could run checksums on software installations to verify that an installed binary was compiled from their posted source. Software that doesn’t match the checksum should be presumed to be pirated.
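A checksum scheme of this sort could be sketched as follows, using Python’s standard `hashlib`. This is only an illustration of the idea, with the file contents and digests invented for the example: a vendor publishes the SHA-256 digest of the official build, and any installation whose digest differs is presumed pirated or tampered with.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest()

def matches_published(path, published_digest):
    """True if the installed binary matches the vendor's published digest."""
    return sha256_of(path) == published_digest

# Illustrative check against a stand-in "binary" we create ourselves.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"official build\n")
    binary_path = f.name

published = hashlib.sha256(b"official build\n").hexdigest()
print(matches_published(binary_path, published))   # prints: True
print(matches_published(binary_path, "0" * 64))    # prints: False
os.remove(binary_path)
```

Note that a checksum only confirms the binary matches a published build; knowing what that build actually does still requires the source code.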

Posting source code for commercial software would enhance the IP protection of software, while at the same time making it possible to avoid U.S., Chinese or any other “backdoors” that may exist in binary software.


China requiring public posting of source code results in these benefits:

  • Greater IP protection
  • Improved software security
  • Easier creation of interoperable add-on software products

What is there to not like about a public open source position for China?

PS: Public Open Source doesn’t answer China’s desire for software “backdoors.” I would urge China to pursue “backdoors” on a one-off basis to avoid the big data trap that now mires U.S. security agencies. The NSA has yet to identify a single terrorist from telephone records going back for years. If China has “backdoors” in all software/hardware, it will fall into the same trap.

If something happens and in hindsight a “backdoor” could have found it, the person who could have accessed the “backdoor” will be punished. Best defense: collect all the data from all the “backdoors” so we don’t miss anything.

If we delete any “backdoor” data and it turns out it was important, we will be punished. Best defense: store all the “backdoor” data, forever.

Upon request we have to search the “backdoor” data, etc. You see where this is going. You will have so much data that the number of connections will overwhelm any information system and your ability to make use of the data.

A better solution has two parts. First, using the public open source, design your own “backdoors.” Vendors can’t betray you. Second, use “backdoors” only in cases of ongoing and focused investigations. Requiring current investigations means you will have contextual information to validate and coordinate with the data from “backdoors.”

China can spend its funds on supporting open source projects that create economic opportunity and growth or on bloated and largely ineffectual security apparatus collecting data from “backdoors.” I am confident it will choose wisely.

Unsustainable Museum Data

Sat, 01/31/2015 - 01:46


Topic Maps

Unsustainable Museum Data by Matthew Lincoln.

From the post:

In which I ask museums to give less API, more KISS and LOCKSS, please.

“How can we ensure our [insert big digital project title here] is sustainable?” So goes the cry from many a nascent digital humanities project, and rightly so! We should be glad that many new ventures are starting out by asking this question, rather than waiting until the last minute to come up with a sustainability plan. But Adam Crymble asks whether an emphasis on web-based digital projects instead of producing and sharing static data files is needlessly worsening our sustainability problem. Rather than allowing users to download the underlying data files (a passel of data tables, or marked-up text files, or even serialized linked data), these web projects mediate those data with user interfaces and guided searching, essentially making the data accessible to the casual user. But serving data piecemeal to users has its drawbacks, notes Crymble. If and when the web server goes down, access to the data disappears:

When something does go wrong we quickly realise it wasn’t the website we needed. It was the data, or it was the functionality. The online element, which we so often see as an asset, has become a liability.

I would broaden the scope of this call to include library and other data as well. Yes, APIs can be very useful but so can a copy of the original data.

Matthew mentions “creative re-use” near the end of his post, but I would highlight that as a major reason for providing the original data. No doubt museums and others work very hard at offering good APIs for data, but any API is only one way to obtain and view data.

For data, any data, to reach its potential, it needs to be available for multiple views of the same data. Some you may think are better, some you may think are worse than the original. But it is the potential for a multiplicity of views that opens up those possibilities. Keeping data behind an API is an act of preventing data from reaching its potential.