Another word for it

Our Favorite Maps of the Year Cover Everything From Bayous to Bullet Trains

Sun, 12/21/2014 - 01:48

Categories:

Topic Maps

Our Favorite Maps of the Year Cover Everything From Bayous to Bullet Trains by Greg Miller (Wired MapLab)

From the post:

What makes a great map? It depends, of course, on who’s doing the judging. Teh internetz loves a map with dazzling colors and a simple message, preferably related to some pop-culture phenomenon. Professional mapmakers love a map that’s aesthetically pleasing and based on solid principles of cartographic design.

We love maps that have a story to tell, the kind of maps where the more you look the more you see. Sometimes we fall for a map mostly because of the data behind it. Sometimes, we’re not ashamed to say, we love a map just for the way it looks. Here are some of the maps we came across this year that captivated us with their brains, their beauty, and in many cases, both.

First, check out the animated map below to see a day’s worth of air traffic over the UK, then toggle the arrow at top right to see the rest of the maps in fullscreen mode.

The “arrow at top right” refers to an arrow that appears when you mouse over the map of the United States at the top of the post. An impressive collection of maps!

For an even more impressive display of air traffic:

Bear in mind that there are approximately 93,000 flights per day, zero (0) of which are troubled by terrorists. The next time your leaders decry terrorism, do remember to ask where?

Creating Tor Hidden Services With Python

Sun, 12/21/2014 - 01:28

Categories:

Topic Maps

Creating Tor Hidden Services With Python by Jordan Wright.

From the post:

Tor is often used to protect the anonymity of someone who is trying to connect to a service. However, it is also possible to use Tor to protect the anonymity of a service provider via hidden services. These services, operating under the .onion TLD, allow publishers to anonymously create and host content viewable only by other Tor users.

The Tor project has instructions on how to create hidden services, but this can be a manual and arduous process if you want to setup multiple services. This post will show how we can use the fantastic stem Python library to automatically create and host a Tor hidden service.

If you are interested in the Tor network, this is a handy post to bookmark.
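For readers who want a feel for what the automation looks like, here is a minimal sketch of the idea, assuming the stem library is installed and a local Tor instance exposes its control port on 9051. The function names, the directory, and the port numbers are my own placeholders, not taken from Jordan's post; only `Controller.from_port`, `authenticate`, and `set_options` are stem calls.

```python
import os


def hidden_service_options(service_dir, virtual_port,
                           target_host="127.0.0.1", target_port=5000):
    """Build the torrc-style option pairs for one hidden service.

    Order matters: HiddenServiceDir must come before its HiddenServicePort.
    """
    return [
        ("HiddenServiceDir", service_dir),
        ("HiddenServicePort",
         "%d %s:%d" % (virtual_port, target_host, target_port)),
    ]


def publish_hidden_service(service_dir, virtual_port=80,
                           target_port=5000, control_port=9051):
    """Ask a running Tor to host the service and return its .onion name.

    Assumes Tor is running locally with its control port enabled.
    """
    # Imported here so the helper above works without stem installed.
    from stem.control import Controller

    with Controller.from_port(port=control_port) as controller:
        controller.authenticate()
        controller.set_options(
            hidden_service_options(service_dir, virtual_port,
                                   target_port=target_port))
        # Tor writes the generated .onion address into this file.
        with open(os.path.join(service_dir, "hostname")) as handle:
            return handle.read().strip()
```

Calling `publish_hidden_service("/tmp/my_service")` with a web server listening on port 5000 should return the service's .onion hostname, provided Tor has finished writing the `hostname` file.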

I was thinking about exploring the Tor network in the new year but you should be aware of a more recent post by Jordan:

What Happens if Tor Directory Authorities Are Seized?

From the post:

The Tor Project has announced that they have received threats about possible upcoming attempts to disable the Tor network through the seizure of Directory Authority (DA) servers. While we don’t know the legitimacy behind these threats, it’s worth looking at the role DA’s play in the Tor network, showing what effects their seizure could have on the Tor network.

Nothing to panic about, yet, but if you know anyone you can urge to protect Tor, do so.

Mapazonia (Mapping the Amazon)

Sun, 12/21/2014 - 00:55

Categories:

Topic Maps

Mapazonia (Mapping the Amazon)

From the about page:

Mapazonia has the aim of improve the OSM data in the Amazon region, using satellite images to map roads and rivers geometry.

A detailed cartography will help many organizations that are working in the Amazon to accomplish their objectives. Together we can collaborate to look after the Amazon and its inhabitants.

The project was born as an initiative of the Latinamerican OpenStreetMap Community with the objective of go ahead with collaborative mapping of common areas and problems in the continent.

We use the Tasking Manager of the Humanitarian OpenStreetMap Team to define the areas where we are going to work. Furthermore we will organize mapathons to teach the persons how to use the tools of collaborative mapping.

Normally I am a big supporter of mapping and especially crowd-sourced mapping projects.

However, the goal of an improved map of the Amazon makes me wonder: who benefits from such a map?

The local inhabitants have known their portions of the Amazon for centuries well enough for their purposes. So I don’t think they are going to benefit from such a map for their day to day activities.

Hmmm, hmmm, who else might benefit from such a map? I haven’t seen any discussion of that topic in the mailing list archives. There seems to be a great deal of enthusiasm for the project, which is a good thing, but little awareness of potential future uses.

Who uses maps of places that are not yet well mapped? Oil, logging, and mining companies, just to name a few of the more pernicious users of maps that come to mind.

To say that the depredations of such users will be checked by government regulations is a jest too cruel for laughter.

There is a valid reason why maps were historically considered as military secrets. One’s opponent could use them to better plan their attacks.

An accurate map of the Amazon will be putting the Amazon directly in the cross-hairs of multiple attackers, with no effective defenders in sight. The Amazon may become as polluted as some American waterways but being unmapped will delay that unhappy day.

I first saw this in a tweet by Alex Barth.

Leading from the Back: Making Data Science Work at a UX-driven Business

Sun, 12/21/2014 - 00:17

Categories:

Topic Maps

Leading from the Back: Making Data Science Work at a UX-driven Business by John Foreman. (Microsoft Visiting Speaker Series)

The first thirty (30) minutes are easily the best ones I have spent on a video this year. (I haven’t finished the Q&A part yet.)

John is a very good speaker but in part his presentation is fascinating because it illustrates how to “sell” data analysis to customers (internal and external).

You will find that while John can do the math, he is also very adept at delivering value to his customer.

Not surprisingly, customers are less interested in bells and whistles or your semantic religion and more interested in value as they perceive it.

Catch the switch in point of view: it isn’t value from your point of view but value from the customer’s point of view.

You need to set aside some time to watch at least the first thirty minutes of this presentation.

BTW, John Foreman is the author of Data Smart, which he confesses is “not sexy.”

I first saw this in a tweet by Microsoft Research.

Teaching Deep Convolutional Neural Networks to Play Go

Sat, 12/20/2014 - 19:38

Categories:

Topic Maps

Teaching Deep Convolutional Neural Networks to Play Go by Christopher Clark and Amos Storkey.

Abstract:

Mastering the game of Go has remained a long standing challenge to the field of AI. Modern computer Go systems rely on processing millions of possible future positions to play well, but intuitively a stronger and more ‘humanlike’ way to play the game would be to rely on pattern recognition abilities rather then brute force computation. Following this sentiment, we train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players. To solve this problem we introduce a number of novel techniques, including a method of tying weights in the network to ‘hard code’ symmetries that are expect to exist in the target function, and demonstrate in an ablation study they considerably improve performance. Our final networks are able to achieve move prediction accuracies of 41.1% and 44.4% on two different Go datasets, surpassing previous state of the art on this task by significant margins. Additionally, while previous move prediction programs have not yielded strong Go playing programs, we show that the networks trained in this work acquired high levels of skill. Our convolutional neural networks can consistently defeat the well known Go program GNU Go, indicating it is state of the art among programs that do not use Monte Carlo Tree Search. It is also able to win some games against state of the art Go playing program Fuego while using a fraction of the play time. This success at playing Go indicates high level principles of the game were learned.
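The abstract's idea of tying weights to "hard code" symmetries can be illustrated, very loosely, by forcing a square filter to be identical under the eight rotations and reflections of a Go board. What follows is my own toy illustration in plain Python, not the construction Clark and Storkey describe; all of the helper names are hypothetical.

```python
def rotate90(kernel):
    """Rotate a square kernel (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*kernel[::-1])]


def mirror(kernel):
    """Reflect a kernel left to right."""
    return [row[::-1] for row in kernel]


def dihedral_images(kernel):
    """Return the 8 rotations and reflections of a square kernel."""
    images = []
    for base in (kernel, mirror(kernel)):
        current = base
        for _ in range(4):
            images.append(current)
            current = rotate90(current)
    return images


def tie_weights(kernel):
    """Average a kernel over all 8 symmetries.

    Positions that map onto each other under a rotation or reflection
    end up sharing one value, so the result is symmetric by construction.
    """
    images = dihedral_images(kernel)
    size = len(kernel)
    return [[sum(img[r][c] for img in images) / len(images)
             for c in range(size)]
            for r in range(size)]


tied = tie_weights([[1, 0, 0],
                    [0, 2, 0],
                    [0, 0, 0]])
```

After tying, all four corners of the 3x3 filter share one weight, all four edges share another, and the center keeps its own, so the filter responds the same way to a pattern and to its mirror image.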

If you are going to pursue the study of Monte Carlo Tree Search for semantic purposes, there isn’t any reason to not enjoy yourself as well.

And following the best efforts in game playing will be educational as well.

I take the efforts at playing Go by computer, as well as those for chess, as indicating how far ahead of AI humans are.

Both of those two-player, complete-knowledge games were mastered long ago by humans. Multi-player games with extended networks of influence and motives, not to mention incomplete information as well, seem securely reserved for human players for the foreseeable future. (I wonder if multi-player scenarios are similar to the multi-body problem in physics? Except with more influences.)

I first saw this in a tweet by Ebenezer Fogus.

Monte-Carlo Tree Search for Multi-Player Games [Semantics as Multi-Player Game]

Sat, 12/20/2014 - 19:25

Categories:

Topic Maps

Monte-Carlo Tree Search for Multi-Player Games by Joseph Antonius Maria Nijssen.

From the introduction:

The topic of this thesis lies in the area of adversarial search in multi-player zero-sum domains, i.e., search in domains having players with conflicting goals. In order to focus on the issues of searching in this type of domains, we shift our attention to abstract games. These games provide a good test domain for Artificial Intelligence (AI). They offer a pure abstract competition (i.e., comparison), with an exact closed domain (i.e., well-defined rules). The games under investigation have the following two properties. (1) They are too complex to be solved with current means, and (2) the games have characteristics that can be formalized in computer programs. AI research has been quite successful in the field of two-player zero-sum games, such as chess, checkers, and Go. This has been achieved by developing two-player search techniques. However, many games do not belong to the area where these search techniques are unconditionally applicable. Multi-player games are an example of such domains. This thesis focuses on two different categories of multi-player games: (1) deterministic multi-player games with perfect information and (2) multi-player hide-and-seek games. In particular, it investigates how Monte-Carlo Tree Search can be improved for games in these two categories. This technique has achieved impressive results in computer Go, but has also shown to be beneficial in a range of other domains.

This chapter is structured as follows. First, an introduction to games and the role they play in the field of AI is provided in Section 1.1. An overview of different game properties is given in Section 1.2. Next, Section 1.3 defines the notion of multi-player games and discusses the two different categories of multi-player games that are investigated in this thesis. A brief introduction to search techniques for two-player and multi-player games is provided in Section 1.4. Subsequently, Section 1.5 defines the problem statement and four research questions. Finally, an overview of this thesis is provided in Section 1.6.
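To make the technique concrete, here is a small sketch of Monte-Carlo Tree Search for a multi-player game with perfect information. It is not code from the thesis. The game is a hypothetical three-player take-away game (remove 1 to 3 stones, whoever takes the last stone wins), and each node keeps one win total per player so that the player to move selects the child that is best for that player. It assumes the starting position has at least one stone.

```python
import math
import random

N_PLAYERS = 3   # hypothetical three-player game
MAX_TAKE = 3    # a move removes 1..MAX_TAKE stones


def legal_moves(stones):
    return list(range(1, min(MAX_TAKE, stones) + 1))


class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones            # stones left on the table
        self.player = player            # player to move at this node
        self.parent = parent
        self.move = move                # move that led to this node
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = [0.0] * N_PLAYERS   # one running total per player

    def ucb_child(self, c=1.4):
        # The player to move scores children by their *own* win rate.
        def score(child):
            mean = child.wins[self.player] / child.visits
            return mean + c * math.sqrt(math.log(self.visits) / child.visits)
        return max(self.children, key=score)


def rollout(stones, player, rng):
    """Play random moves to the end and return the winning player."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return player
        player = (player + 1) % N_PLAYERS


def mcts(stones, player, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones, player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = node.ucb_child()
        # 2. Expansion: add one untried move, if any remain.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move,
                         (node.player + 1) % N_PLAYERS, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: a terminal node was won by whoever moved into it.
        if node.stones == 0:
            winner = node.parent.player
        else:
            winner = rollout(node.stones, node.player, rng)
        # 4. Backpropagation: credit the winner along the path to the root.
        while node is not None:
            node.visits += 1
            node.wins[winner] += 1.0
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda child: child.visits).move
```

With three stones left, `mcts(3, 0)` settles on taking all three, since that wins on the spot. The only change from the two-player version is the per-player win vector, which is what lets each player in the tree pursue their own interest.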

This thesis is great background reading on the use of Monte-Carlo tree search in games. While reading the first chapter, I realized that assigning semantics to a token is an instance of a multi-player game with hidden information. That is, the “semantic” of any token doesn’t exist in some Platonic universe but rather is the result of some N number of players who also accept a particular semantic for some given token in a particular context. And we lack knowledge of the semantic and the reasons for it that will be assigned by some N number of players, which may change over time and context.

The semiotic triangle of Ogden and Richards (The Meaning of Meaning), for any given symbol, represents the view of a single speaker. But as Ogden and Richards note, what is heard by listeners should be represented by multiple semiotic triangles:

Normally, whenever we hear anything said we spring spontaneously to an immediate conclusion, namely, that the speaker is referring to what we should be referring to were we speaking the words ourselves. In some cases this interpretation may be correct; this will prove to be what he has referred to. But in most discussions which attempt greater subtleties than could be handled in a gesture language this will not be so. (The Meaning of Meaning, page 15 of the 1923 edition)

Is RDF/OWL more subtle than can be handled by a gesture language? If you think so then you have discovered one of the central problems with the Semantic Web and any other universal semantic proposal.

Not that topic maps escape a similar accusation, but with topic maps you can encode additional semiotic triangles in an effort to avoid confusion, at least to the extent of funding and interest. And if you aren’t trying to avoid confusion, you can supply semiotic triangles that reach across understandings to convey additional information.

You can’t avoid confusion altogether nor can you achieve perfect communication with all listeners. But, for some defined set of confusions or listeners, you can do more than simply repeat your original statements in a louder voice.

Whether Monte-Carlo Tree searches will help deal with the multi-player nature of semantics isn’t clear but it is an alternative to repeating “…if everyone would use the same (my) system, the world would be better off…” ad nauseam.

I first saw this in a tweet by Ebenezer Fogus.

Linked Open Data Visualization Revisited: A Survey

Sat, 12/20/2014 - 16:48

Categories:

Topic Maps

Linked Open Data Visualization Revisited: A Survey by Oscar Peña, Unai Aguilera and Diego López-de-Ipiña.

Abstract:

Mass adoption of the Semantic Web’s vision will not become a reality unless the benefits provided by data published under the Linked Open Data principles are understood by the majority of users. As technical and implementation details are far from being interesting for lay users, the ability of machines and algorithms to understand what the data is about should provide smarter summarisations of the available data. Visualization of Linked Open Data proposes itself as a perfect strategy to ease the access to information by all users, in order to save time learning what the dataset is about and without requiring knowledge on semantics.

This article collects previous studies from the Information Visualization and the Exploratory Data Analysis fields in order to apply the lessons learned to Linked Open Data visualization. Datatype analysis and visualization tasks proposed by Ben Shneiderman are also added in the research to cover different visualization features.

Finally, an evaluation of the current approaches is performed based on the dimensions previously exposed. The article ends with some conclusions extracted from the research.

I would like to see a version of this article after it has had several good editing passes. From the abstract alone, “…benefits provided by data…” and “…without requiring knowledge on semantics…” strike me as extremely problematic.

Data, accessible or not, does not provide benefits. The results of processing data may, which may explain the lack of enthusiasm when large data dumps are made web accessible. In and of itself, it is just another large dump of data. The results of processing that data may be very useful, but that is another step in the process.

I don’t think “…without requiring knowledge on semantics…” is in line with the rest of the article. I suspect the authors meant the semantics of data sets could be conveyed to users without their researching them prior to using the data set. I think that is problematic but it has the advantage of being plausible.

The various theories of visualization and datatypes (pages 3-8) don’t seem to advance the discussion and I would either drop that content or tie it into the actual visualization suites discussed. It’s educational but its relationship to the rest of the article is tenuous.

The coverage of visualization suites is encouraging and useful, but with an overall tighter focus, more time could be spent on each one, with correspondingly longer entries.

Hopefully we will see a later, edited version of this paper as a good summary/guide to visualization tools for linked data would be a useful addition to the literature.

I first saw this in a tweet by Marin Dimitrov.

BigDataScript: a scripting language for data pipelines

Sat, 12/20/2014 - 01:34

Categories:

Topic Maps

BigDataScript: a scripting language for data pipelines by Pablo Cingolani, Rob Sladek, and Mathieu Blanchette.

Abstract:

Motivation: The analysis of large biological datasets often requires complex processing pipelines that run for a long time on large computational infrastructures. We designed and implemented a simple script-like programming language with a clean and minimalist syntax to develop and manage pipeline execution and provide robustness to various types of software and hardware failures as well as portability.

Results: We introduce the BigDataScript (BDS) programming language for data processing pipelines, which improves abstraction from hardware resources and assists with robustness. Hardware abstraction allows BDS pipelines to run without modification on a wide range of computer architectures, from a small laptop to multi-core servers, server farms, clusters and clouds. BDS achieves robustness by incorporating the concepts of absolute serialization and lazy processing, thus allowing pipelines to recover from errors. By abstracting pipeline concepts at programming language level, BDS simplifies implementation, execution and management of complex bioinformatics pipelines, resulting in reduced development and debugging cycles as well as cleaner code.

Availability and implementation: BigDataScript is available under open-source license at http://pcingola.github.io/BigDataScript.
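As I read the abstract, "lazy processing" means a step is skipped when its outputs are already up to date, which is what lets a pipeline pick up where it failed. A make-style sketch of that idea in Python, offered as an assumption about the concept rather than a description of how BDS implements it (the function names are mine):

```python
import os


def needs_update(output, inputs):
    """Return True if `output` is missing or older than any input file."""
    if not os.path.exists(output):
        return True
    output_mtime = os.path.getmtime(output)
    return any(os.path.getmtime(path) > output_mtime for path in inputs)


def run_step(output, inputs, action):
    """Run `action(inputs, output)` only when the output is stale.

    Returns True if the step ran and False if it was skipped.
    """
    if needs_update(output, inputs):
        action(inputs, output)
        return True
    return False
```

Rerunning a failed pipeline built from such steps repeats only the work whose outputs are missing or out of date.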

How would you compare this pipeline proposal to: XProc 2.0: An XML Pipeline Language?

I prefer XML solutions because I can reliably point to an element or attribute to endow it with explicit semantics.

While explicit semantics is my hobby horse, it may not be yours. Curious how you view this specialized language for bioinformatics pipelines?

I first saw this in a tweet by Pierre Lindenbaum.

DeepSpeech: Scaling up end-to-end speech recognition [Is Deep the new Big?]

Fri, 12/19/2014 - 22:18

Categories:

Topic Maps

DeepSpeech: Scaling up end-to-end speech recognition by Awni Hannun, et al.

Abstract:

We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, our system does not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learns a function that is robust to such effects. We do not need a phoneme dictionary, nor even the concept of a “phoneme.” Key to our approach is a well-optimized RNN training system that uses multiple GPUs, as well as a set of novel data synthesis techniques that allow us to efficiently obtain a large amount of varied data for training. Our system, called DeepSpeech, outperforms previously published results on the widely studied Switchboard Hub5’00, achieving 16.5% error on the full test set. DeepSpeech also handles challenging noisy environments better than widely used, state-of-the-art commercial speech systems.

Although the academic papers, so far, are using “deep learning” in a meaningful sense, early 2015 is likely to see many vendors rebranding their offerings as incorporating or being based on deep learning.

When approached with any “deep learning” application or service, check out the Internet Archive WayBack Machine to see how they were marketing their software/service before “deep learning” became popular.
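If you have more than a couple of vendors to check, the Internet Archive's Wayback availability endpoint can find the snapshot closest to a given date. A minimal sketch, assuming that endpoint and its JSON layout (`archived_snapshots` / `closest`) are unchanged; the function names and the example domain are placeholders of mine:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(site, timestamp):
    """Build the query URL; timestamp is YYYYMMDD (or a prefix of it)."""
    return API + "?" + urlencode({"url": site, "timestamp": timestamp})


def closest_snapshot(payload):
    """Pull the archived URL out of a decoded API response, if any."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(site, timestamp):
    """Fetch the snapshot of `site` nearest to `timestamp` (needs network)."""
    with urlopen(availability_url(site, timestamp)) as response:
        return closest_snapshot(json.load(response))
```

Something like `lookup("vendor.example.com", "20130101")` would return the address of an archived page from around the start of 2013, or `None` if nothing was captured.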

Is there a GPU-powered box in your future?

I first saw this in a tweet by Andrew Ng.

Update: After posting I encountered: Baidu claims deep learning breakthrough with Deep Speech by Derrick Harris. Talks to Andrew Ng, great write-up.

The top 10 Big data and analytics tutorials in 2014

Fri, 12/19/2014 - 21:31

Categories:

Topic Maps

The top 10 Big data and analytics tutorials in 2014 by Sarah Domina.

From the post:

At developerWorks, our Big data and analytics content helps you learn to leverage the tools and technologies to harness and analyze data. Let’s take a look back at the top 10 tutorials from 2014, in no particular order.

There are a couple of IBM product line specific tutorials but the majority of them you will enjoy whether you are an IBM shop or not.

Oddly enough, the post for the top ten (10) in 2014 was made on 26 September 2014.

Either Watson is far better than I have ever imagined or IBM has its own calendar.

In favor of an IBM calendar, I would point out that IBM has its own song.

A flag:

IBM ranks ahead of Morocco in terms of GDP at $99.751 billion.

Does IBM have its own calendar? Hard to say for sure but I would not doubt it.

Collection of CRS reports released to the public

Fri, 12/19/2014 - 21:07

Categories:

Topic Maps

Collection of CRS reports released to the public by Kevin Kosar.

From the post:

Something rare has occurred—a collection of reports authored by the Congressional Research Service has been published and made freely available to the public. The 400-page volume, titled, “The Evolving Congress,” and was produced in conjunction with CRS’s celebration of its 100th anniversary this year. Congress, not CRS, published it. (Disclaimer: Before departing CRS in October, I helped edit a portion of the volume.)

The Congressional Research Service does not release its reports publicly. CRS posts its reports at CRS.gov, a website accessible only to Congress and its staff. The agency has a variety of reasons for this policy, not least that its statute does not assign it this duty. Congress, with ease, could change this policy. Indeed, it already makes publicly available the bill digests (or “summaries”) CRS produces at Congress.gov.

“The Evolving Congress” is a remarkable collection of essays that cover a broad range of topic. Readers would be advised to start from the beginning. Walter Oleszek provides a lengthy essay on how Congress has changed over the past century. Michael Koempel then assesses how the job of Congressman has evolved (or devolved depending on one’s perspective). “Over time, both Chambers developed strategies to reduce the quantity of time given over to legislative work in order to accommodate Members’ other duties,” Koempel observes.

The NIH (National Institutes of Health) requires that NIH funded research be made available to the public. Other government agencies are following suit. Isn’t it time for the Congressional Research Service to make its publicly funded research available to the public that paid for it?

Congress needs to require it. Contact your member of Congress today. Ask that all Congressional Research Service reports, past, present and future, be made available to the public.

You have already paid for the reports, why shouldn’t you be able to read them?

Senate Joins House In Publishing Legislative Information In Modern Formats [No More Sneaking?]

Fri, 12/19/2014 - 20:29

Categories:

Topic Maps

Senate Joins House In Publishing Legislative Information In Modern Formats by Daniel Schuman.

From the post:

There’s big news from today’s Legislative Branch Bulk Data Task Force meeting. The United States Senate announced it would begin publishing text and summary information for Senate legislation, going back to the 113th Congress, in bulk XML. It would join the House of Representatives, which already does this. Both chambers also expect to have bill status information available online in XML format as well, but a little later on in the year.

This move goes a long way to meet the request made by a coalition of transparency organizations, which asked for legislative information be made available online, in bulk, in machine-processable formats. These changes, once implemented, will hopefully put an end to screen scraping and empower users to build impressive tools with authoritative legislative data. A meeting to spec out publication methods will be hosted by the Task Force in late January/early February.

The Senate should be commended for making the leap into the 21st century with respect to providing the American people with crucial legislative information. We will watch closely to see how this is implemented and hope to work with the Senate as it moves forward.

In addition, the Clerk of the House announced significant new information will soon be published online in machine-processable formats. This includes data on nominees, election statistics, and members (such as committee assignments, bioguide IDs, start date, preferred name, etc.) Separately, House Live has been upgraded so that all video is now in H.264 format. The Clerk’s website is also undergoing a redesign.

The Office of Law Revision Counsel, which publishes the US Code, has further upgraded its website to allow pinpoint citations for the US Code. Users can drill down to the subclause level simply by typing the information into their search engine. This is incredibly handy.

This is great news!

Law is a notoriously opaque domain and the process of creating it even more so. Getting the data is a great first step; parsing out the steps in the process and their meaning is another. To say nothing of the content of the laws themselves.

Still, progress is progress and always welcome!

Perhaps citizen review will stop the Senate from sneaking changes past sleepy members of the House.

New in Cloudera Labs: SparkOnHBase

Fri, 12/19/2014 - 19:59

Categories:

Topic Maps

New in Cloudera Labs: SparkOnHBase by Ted Malaska.

From the post:

Apache Spark is making a huge impact across our industry, changing the way we think about batch processing and stream processing. However, as we progressively migrate from MapReduce toward Spark, we shouldn’t have to “give up” anything. One of those capabilities we need to retain is the ability to interact with Apache HBase.

In this post, we will share the work being done in Cloudera Labs to make integrating Spark and HBase super-easy in the form of the SparkOnHBase project. (As with everything else in Cloudera Labs, SparkOnHBase is not supported and there is no timetable for possible support in the future; it’s for experimentation only.) You’ll learn common patterns of HBase integration with Spark and see Scala and Java examples for each. (It may be helpful to have the SparkOnHBase repository open as you read along.)

Is it too late to amend my wish list to include an eighty-hour week with Spark?

This is an excellent opportunity to follow along with lab quality research on an important technology.

The Cloudera Labs discussion group strikes me as dreadfully underused.

Enjoy!

XProc 2.0: An XML Pipeline Language

Fri, 12/19/2014 - 17:07

Categories:

Topic Maps

XProc 2.0: An XML Pipeline Language W3C First Public Working Draft 18 December 2014

Abstract:

This specification describes the syntax and semantics of XProc 2.0: An XML Pipeline Language, a language for describing operations to be performed on documents.

An XML Pipeline specifies a sequence of operations to be performed on documents. Pipelines generally accept documents as input and produce documents as output. Pipelines are made up of simple steps which perform atomic operations on documents and constructs similar to conditionals, iteration, and exception handlers which control which steps are executed.

For your proofing responses:

Please report errors in this document by raising issues on the specification repository. Alternatively, you may report errors in this document to the public mailing list public-xml-processing-model-comments@w3.org (public archives are available).

First drafts always need a close reading for omissions and errors. However, after looking at the editors of XProc 2.0, you aren’t likely to find any “cheap” errors. Makes proofing all the more fun.

Enjoy!

XQuery, XPath, XQuery/XPath Functions and Operators 3.1

Fri, 12/19/2014 - 16:56

Categories:

Topic Maps

XQuery, XPath, XQuery/XPath Functions and Operators 3.1 were published on 18 December 2014 as a call for implementation of these specifications.

The changes most often noted were the addition of capabilities for maps and arrays. “Support for JSON” means sections 17.4 and 17.5 of XPath and XQuery Functions and Operators 3.1.

XQuery 3.1 and XPath 3.1 depend on XPath and XQuery Functions and Operators 3.1 for JSON support. (Is there no acronym for XPath and XQuery Functions and Operators? Suggest XF&O.)

For your reading pleasure:

XQuery 3.1: An XML Query Language

  1. 3.10.1 Maps
  2. 3.10.2 Arrays

XML Path Language (XPath) 3.1

  1. 3.11.1 Maps
  2. 3.11.2 Arrays

XPath and XQuery Functions and Operators 3.1

  1. 17.1 Functions that Operate on Maps
  2. 17.3 Functions that Operate on Arrays
  3. 17.4 Conversion to and from JSON
  4. 17.5 Functions on JSON Data

Hoping that your holiday gifts include a large box of highlighters and/or a box of red pencils!

Oh, these specifications will “…remain as Candidate Recommendation(s) until at least 13 February 2015. (emphasis added)”

Less than two months so read quickly and carefully.

Enjoy!

I first saw this in a tweet by Jonathan Robie.

The Top 10 Posts of 2014 from the Cloudera Engineering Blog

Fri, 12/19/2014 - 01:46

Categories:

Topic Maps

The Top 10 Posts of 2014 from the Cloudera Engineering Blog by Justin Kestelyn.

From the post:

Our “Top 10” list of blog posts published during a calendar year is a crowd favorite (see the 2013 version here), in particular because it serves as informal, crowdsourced research about popular interests. Page views don’t lie (although skew for publishing date—clearly, posts that publish earlier in the year have pole position—has to be taken into account).

In 2014, a strong interest in various new components that bring real time or near-real time capabilities to the Apache Hadoop ecosystem is apparent. And we’re particularly proud that the most popular post was authored by a non-employee.

See Justin’s post for the top ten (10) list!

The Cloudera blog always has high quality content so this is the cream of the crop!

Enjoy!

Announcing Apache Storm 0.9.3

Fri, 12/19/2014 - 01:32

Categories:

Topic Maps

Announcing Apache Storm 0.9.3 by Taylor Goetz

From the post:

With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it for batch, interactive and real-time streaming use cases. Apache Storm brings real-time data processing capabilities to help capture new business opportunities by powering low-latency dashboards, security alerts, and operational enhancements integrated with other applications running in the Hadoop cluster.

Now there’s an early holiday surprise!

Enjoy!

GovTrack’s Summer/Fall Updates

Fri, 12/19/2014 - 01:14

Categories:

Topic Maps

GovTrack’s Summer/Fall Updates by Josh Tauberer.

From the post:

Here’s what’s been improved on GovTrack in the summer and fall of this year.

  • Permalinks to individual paragraphs in bill text is now provided (example).
  • We now ask for your congressional district so that we can customize vote and bill pages to show how your Members of Congress voted.
  • Our bill action/status flow charts on bill pages now include activity on certain related bills, which are often crucially important to the main bill.
  • The bill cosponsors list now indicates when a cosponsor of a bill is no longer serving (i.e. because of retirement or death).
  • We switched to gender neutral language when referring to Members of Congress. Instead of “congressman/woman”, we now use “representative.”
  • Our historical votes database (1979-1989) from voteview.com was refreshed to correct long-standing data errors.
  • We dropped support for Internet Explorer 6 in order to address with POODLE SSL security vulnerability that plagued most of the web.
  • We dropped support for Internet Explorer 7 in order to allow us to make use of more modern technologies, which has always been the point of GovTrack.

The comment I posted was:

Great work! But I read the other day about legislation being “snuck” by the House (Senate changes), US Congress OKs ‘unprecedented’ codification of warrantless surveillance.

Do you have plans for a diff utility that warns members of either house of changes to pending legislation?
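The core of such a utility need not be elaborate. A rough sketch using Python's standard difflib, assuming two plain-text versions of a bill are already in hand; the function name and the sample text are hypothetical, and fetching the versions and notifying members are left out:

```python
import difflib


def bill_changes(old_text, new_text):
    """Return the lines removed from or added to a bill, in order.

    Removed lines are prefixed with "- " and added lines with "+ ".
    """
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical bill text, for illustration only.
before = "Sec. 1. Short title.\nSec. 2. Definitions."
after = before + "\nSec. 3. A late addition."

for change in bill_changes(before, after):
    print(change)
```

Anything the function returns for a bill that has already passed one chamber is a candidate for an alert.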

In case you aren’t familiar with GovTrack.us.

From the about page:

GovTrack.us, a project of Civic Impulse, LLC now in its 10th year, is one of the worldʼs most visited government transparency websites. The site helps ordinary citizens find and track bills in the U.S. Congress and understand their representatives’ legislative record.

In 2013, GovTrack.us was used by 8 million individuals. We sent out 3 million legislative update email alerts. Our embeddable widgets were deployed on more than 80 official websites of Members of Congress.

We bring together the status of U.S. federal legislation, voting records, congressional district maps, and more (see the table at the right) and make it easier to understand. Use GovTrack to track bills for updates or get alerts about votes with email updates and RSS feeds. We also have unique statistical analyses to put the information in context. Read the «Analysis Methodology».

GovTrack openly shares the data it brings together so that other websites can build other tools to help citizens engage with government. See the «Developer Documentation» for more.