Another word for it


Know Your Algorithms and Data!

Sun, 09/21/2014 - 21:46

Categories:

Topic Maps

If you let me pick the algorithm or the data, I can produce any result you want.

Something to keep in mind when listening to reports of “facts.”
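As a toy illustration (my own, not from the post), the same hypothetical numbers support opposite headlines depending on which summary algorithm, or which subset of the data, you let me pick:

    # Hypothetical incomes, in thousands of dollars.
    import statistics

    incomes = [18, 19, 20, 21, 22, 23, 24, 25, 250]

    # Pick the algorithm: mean and median tell different stories.
    print(statistics.mean(incomes))    # ~46.9  "average income is healthy"
    print(statistics.median(incomes))  # 22     "typical income is modest"

    # Or pick the data: a cherry-picked subset "shows" incomes soaring.
    print(statistics.mean(incomes[-3:]))  # ~99.7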

Or as Nietzsche would say:

There are no facts, only interpretations.

There are people who are so naive that they don’t realize interpretations other than theirs are possible. Avoid them unless you have need of followers for some reason.

I first saw this in a tweet by Chris Arnold.

Fixing Pentagon Intelligence ['data glut but an information deficit']

Sun, 09/21/2014 - 21:24

Categories:

Topic Maps

Fixing Pentagon Intelligence by John R. Schindler.

From the post:

The U.S. Intelligence Community (IC), that vast agglomeration of seventeen different hush-hush agencies, is an espionage behemoth without peer anywhere on earth in terms of budget and capabilities. Fully eight of those spy agencies, plus the lion’s share of the IC’s budget, belong to the Department of Defense (DoD), making the Pentagon’s intelligence arm something special. It includes the intelligence agencies of all the armed services, but the jewel in the crown is the National Security Agency (NSA), America’s “big ears,” with the National Geospatial-Intelligence Agency (NGA), which produces amazing imagery, following close behind.

None can question the technical capabilities of DoD intelligence, but do the Pentagon’s spies actually know what they are talking about? This is an important, and too infrequently asked, question. Yet it was more or less asked this week, in a public forum, by a top military intelligence leader. The venue was an annual Washington, DC, intelligence conference that hosts IC higher-ups while defense contractors attempt a feeding frenzy, and the speaker was Rear Admiral Paul Becker, who serves as the Director of Intelligence (J2) on the Joint Chiefs of Staff (JCS). A career Navy intelligence officer, Becker’s job is keeping the Pentagon’s military bosses in the know on hot-button issues: it’s a firehose-drinking position, made bureaucratically complicated because JCS intelligence support comes from the Defense Intelligence Agency (DIA), which is an all-source shop that has never been a top-tier IC agency, and which happens to have some serious leadership churn at present.

Admiral Becker’s comments on the state of DoD intelligence, which were rather direct, merit attention. Not surprisingly for a Navy guy, he focused on China. He correctly noted that we have no trouble collecting the “dots” of (alleged) 9/11 infamy, but can the Pentagon’s big battalions of intel folks actually derive the necessary knowledge from all those tasty SIGINT, HUMINT, and IMINT morsels? Becker observed — accurately — that DoD intelligence possesses a “data glut but an information deficit” about China, adding that “We need to understand their strategy better.” In addition, he rued the absence of top-notch intelligence analysts of the sort the IC used to possess, asking pointedly: “Where are those people for China? We need them.”

Admiral Becker’s phrase:

“data glut but an information deficit” (emphasis added)

captures the essence of phone record subpoenas, mass collection of emails, etc., all designed to give the impression of frenzied activity, with no proof of effectiveness. That is an “information deficit.”

Rest assured, you can host a data glut in a topic map, so topic maps per se are no threat to current data gluts. It is possible, however, to layer topic maps over existing data gluts to create information and actionable intelligence, without disturbing the underlying data gluts or their contractors.
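A minimal sketch of that layering, in Python, with invented records and identifiers (an illustration of the idea only, not a Topic Maps Data Model implementation):

    # Raw records (the "data glut") are read but never modified.
    raw_records = [
        {"source": "SIGINT", "doc": 101, "entity": "PLA Navy"},
        {"source": "HUMINT", "doc": 202, "entity": "People's Liberation Army Navy"},
        {"source": "IMINT",  "doc": 303, "entity": "PLAN"},
    ]

    # Topic layer: one subject, several identifiers, occurrences pointing back at documents.
    topic = {
        "identifiers": {"PLA Navy", "People's Liberation Army Navy", "PLAN"},
        "occurrences": [],
    }

    for record in raw_records:
        if record["entity"] in topic["identifiers"]:
            topic["occurrences"].append(record["doc"])

    print(topic["occurrences"])  # [101, 202, 303]: one subject, three silos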

I tried to find a video of Adm. Becker’s presentation but apparently the Intelligence and National Security Summit 2014 does not provide video recordings of presentations. Whether that is to prevent any contemporaneous record being kept of remarks or just being low-tech kinda folks isn’t clear.

I can point out the meeting did have a known liar, “The Honorable James Clapper,” on the agenda. Hard to know whether having perjured himself in front of Congress has made him gun-shy of recorded speeches. (For Clapper’s latest “spin” on “the least untruthful,” see: James Clapper says he misspoke, didn’t lie about NSA surveillance.) One hopes by next year’s conference Clapper will appear as: James Clapper, former DNI, convicted felon, Federal Prison Register #….

If you are interested in intelligence issues, you should be following John R. Schindler. His is a U.S. perspective, but handling intelligence issues with topic maps will vary only in the details, not the underlying principles, from one intelligence service to another.

Disclosure: I rag on the intelligence services of the United States due to greater access to public information on those services. Don’t take that as a greater interest in how their operations, rather than those of other intelligence services, could be improved by topic maps.

I am happy to discuss how your intelligence services can (or can’t) be improved by topic maps. There are problems, such as those discussed by Admiral Becker, that can’t be fixed by using topic maps. I will be as quick to point those out as I will problems where topic maps are relevant. My goal is your satisfaction that topic maps made a difference for you, not having a government entity in a billing database.

A Closed Future for Mathematics?

Sun, 09/21/2014 - 20:18

Categories:

Topic Maps

A Closed Future for Mathematics? by Eric Raymond.

From the post:

In a blog post on Computational Knowledge and the Future of Pure Mathematics Stephen Wolfram lays out a vision that is in many ways exciting and challenging. What if all of mathematics could be expressed in a common formal notation, stored in computers so it is searchable and amenable to computer-assisted discovery and proof of new theorems?

… to be trusted, the entire system will need to be transparent top to bottom. The design, the data representations, and the implementation code for its software must all be freely auditable by third-party mathematical topic experts and mathematically literate software engineers.

Eric identifies three (3) types of errors that may exist inside the proposed closed system from Wolfram.

Is transparency of a Wolfram solution the only way to trust a Wolfram solution?

For any operation or series of operations performed with Wolfram software, you could perform the same operation in one or more open or closed source systems and see if the results agree. The more often they agree for some set of operations, the greater your confidence in those operations with Wolfram software.

That doesn’t mean that the next operation or a change in the order of operations is going to produce a trustworthy result. Just that for some specified set of operations in a particular order with specified data that you obtained the same result from multiple software solutions.
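As a concrete sketch of that cross-checking, with sympy and scipy standing in as two independent (non-Wolfram) systems and an arbitrary integral as the operation:

    import math
    import sympy
    from scipy.integrate import quad

    x = sympy.symbols("x")
    # Symbolic evaluation of the integral of exp(-x^2) over [0, 1].
    symbolic = float(sympy.integrate(sympy.exp(-x**2), (x, 0, 1)))
    # Independent numerical evaluation of the same integral.
    numeric, _err = quad(lambda t: math.exp(-t**2), 0, 1)

    # Agreement raises confidence in this operation on this input -- nothing more.
    assert abs(symbolic - numeric) < 1e-9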

It could be that all the software solutions implement the same incorrect algorithm, implement the same valid algorithm incorrectly, or share the same errors in search engines searching a mathematical database (which could only be evaluated against the data being searched).

Where N is the number of non-Wolfram software packages you are using to check the Wolfram-based solution and W represents the amount of work to obtain a solution in each, the checking alone adds N x W work on top of the original computation.

In addition to not resulting in the trust Eric is describing, it is an increase in your workload.

I first saw this in a tweet by Michael Nielsen.

Medical Heritage Library (MHL)

Sun, 09/21/2014 - 15:48

Categories:

Topic Maps

Medical Heritage Library (MHL)

From the post:

The Medical Heritage Library (MHL) and DPLA are pleased to announce that MHL content can now be discovered through DPLA.

The MHL, a specialized research collection stored in the Internet Archive, currently includes nearly 60,000 digital rare books, serials, audio and video recordings, and ephemera in the history of medicine, public health, biomedical sciences, and popular medicine from the medical special collections of 22 academic, special, and public libraries. MHL materials have been selected through a rigorous process of curation by subject specialist librarians and archivists and through consultation with an advisory committee of scholars in the history of medicine, public health, gender studies, digital humanities, and related fields. Items, selected for their educational and research value, extend from 1235 (Liber Aristotil[is] de nat[u]r[a] a[nima]li[u]m ag[res]tium [et] marino[rum]), to 2014 (The Grog Issue 40 2014) with the bulk of the materials dating from the 19th century.

“The rich history of medicine content curated by the MHL is available for the first time alongside collections like those from the Biodiversity Heritage Library and the Smithsonian, and offers users a single access point to hundreds of thousands of scientific and history of science resources,” said DPLA Assistant Director for Content Amy Rudersdorf.

The collection is particularly deep in American and Western European medical publications in English, although more than a dozen languages are represented. Subjects include anatomy, dental medicine, surgery, public health, infectious diseases, forensics and legal medicine, gynecology, psychology, anatomy, therapeutics, obstetrics, neuroscience, alternative medicine, spirituality and demonology, diet and dress reform, tobacco, and homeopathy. The breadth of the collection is illustrated by these popular items: the United States Naval Bureau of Medical History’s audio oral history with Doctor Walter Burwell (1994) who served in the Pacific theatre during World War II and witnessed the first Japanese kamikaze attacks; History and medical description of the two-headed girl : sold by her agents for her special benefit, at 25 cents (1869), the first edition of Gray’s Anatomy (1858) (the single most-downloaded MHL text at more than 2,000 downloads annually), and a video collection of Hanna-Barbera Production Flintstones (1960) commercials for Winston cigarettes.

“As is clear from today’s headlines, science, health, and medicine have an impact on the daily lives of Americans,” said Scott H. Podolsky, chair of the MHL’s Scholarly Advisory Committee. “Vaccination, epidemics, antibiotics, and access to health care are only a few of the ongoing issues the history of which are well documented in the MHL. Partnering with the DPLA offers us unparalleled opportunities to reach new and underserved audiences, including scholars and students who don’t have access to special collections in their home institutions and the broader interested public.“

Quick links:

Digital Public Library of America

Internet Archive

Medical Heritage Library website
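As an example of what the DPLA integration makes possible, here is a hedged sketch of a search against DPLA’s public v2 items API; the API key is a placeholder and the query term is only an example:

    import requests

    params = {
        "q": "history of medicine",   # example search term
        "api_key": "YOUR_API_KEY",    # placeholder; request a key from DPLA
        "page_size": 5,
    }
    resp = requests.get("https://api.dp.la/v2/items", params=params)
    data = resp.json()

    print(data.get("count"), "matching items")
    for doc in data.get("docs", []):
        print(doc.get("sourceResource", {}).get("title"))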

I remember the Flintstone commercials for Winston cigarettes. Not all that effective a campaign: I smoked Marlboros (reds in a box) for almost forty-five (45) years.

As old vices die out, new ones, like texting and driving, take their place. On behalf of current and former smokers, I am confident that smoking was not a factor in 1,600,000 accidents per year and 11 teen deaths every day.

Apache Lucene and Solr 4.10

Sun, 09/21/2014 - 15:07

Categories:

Topic Maps

Apache Lucene and Solr 4.10

From the post:

Today the Apache Lucene and Solr PMC announced another version of the Apache Lucene library and Apache Solr search server, numbered 4.10. This is the next release continuing the 4.x line of both Apache Lucene and Apache Solr.

Here are some of the changes that were made compared to 4.9:

Lucene
  • Simplified Version handling for analyzers
  • TermAutomatonQuery was added
  • Optimizations and bug fixes
Solr
  • Ability to automatically add replicas in SolrCloud mode in HDFS
  • Ability to export full results set
  • Distributed support for facet.pivot
  • Optimizations and bugfixes from Lucene 4.9

The full list of changes for Lucene can be found at http://wiki.apache.org/lucene-java/ReleaseNote410. The full list of changes in Solr 4.10 can be found at http://wiki.apache.org/solr/ReleaseNote410.

The Apache Lucene 4.10 library can be downloaded from the following address: http://www.apache.org/dyn/closer.cgi/lucene/java/. Apache Solr 4.10 can be downloaded at the following URL address: http://www.apache.org/dyn/closer.cgi/lucene/solr/. Please remember that the mirrors are just starting to update, so not all of them will contain the 4.10 version of Lucene and Solr.
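Of the Solr changes listed above, distributed facet.pivot is easy to exercise from any HTTP client. A hedged sketch, assuming a local Solr with a “products” collection and “category”/“brand” fields (all assumptions for illustration):

    import requests

    params = {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.pivot": "category,brand",  # hierarchical facet counts
        "wt": "json",
    }
    resp = requests.get("http://localhost:8983/solr/products/select", params=params)
    pivots = resp.json()["facet_counts"]["facet_pivot"]["category,brand"]
    for pivot in pivots:
        print(pivot["value"], pivot["count"])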

A belated note about Apache Lucene and Solr 4.10.

I must have been distracted by the continued fumbling with the Ebola crisis. I no longer wonder how the international community would respond to an actual worldwide threat. In a word, ineffectively.

WWW 2015 Call for Research Papers

Sun, 09/21/2014 - 01:18

Categories:

Topic Maps

WWW 2015 Call for Research Papers

From the webpage:

Important Dates:

  • Research track abstract registration:
    Monday, November 3, 2014 (23:59 Hawaii Standard Time)
  • Research track full paper submission:
    Monday, November 10, 2014 (23:59 Hawaii Standard Time)
  • Notifications of acceptance:
    Saturday, January 17, 2015
  • Final Submission Deadline for Camera-ready Version:
    Sunday, March 8, 2015
  • Conference dates:
    May 18 – 22, 2015

Research papers should be submitted through EasyChair at:
https://easychair.org/conferences/?conf=www2015

For more than two decades, the International World Wide Web (WWW) Conference has been the premier venue for researchers, academics, businesses, and standard bodies to come together and discuss latest updates on the state and evolutionary path of the Web. The main conference program of WWW 2015 will have 11 areas (or themes) for refereed paper presentations, and we invite you to submit your cutting-edge, exciting, new breakthrough work to the relevant area. In addition to the main conference, WWW 2015 will also have a series of co-located workshops, keynote speeches, tutorials, panels, a developer track, and poster and demo sessions.

The list of areas for this year is as follows:

  • Behavioral Analysis and Personalization
  • Crowdsourcing Systems and Social Media
  • Content Analysis
  • Internet Economics and Monetization
  • Pervasive Web and Mobility
  • Security and Privacy
  • Semantic Web
  • Social Networks and Graph Analysis
  • Web Infrastructure: Datacenters, Content Delivery Networks, and Cloud Computing
  • Web Mining
  • Web Search Systems and Applications

Great conference, great weather (Florence in May), and it is in Florence, Italy. What other reasons do you need to attend?

Why news organizations need to invest in better linking and tagging

Sun, 09/21/2014 - 01:08

Categories:

Topic Maps

Why news organizations need to invest in better linking and tagging by Frédéric Filloux.

From the post:

Most media organizations are still stuck in version 1.0 of linking. When they produce content, they assign tags and links mostly to other internal content. This is done out of fear that readers would escape for good if doors were opened too wide. Assigning tags is not exact science: I recently spotted a story about the new pregnancy in the British royal family; it was tagged “demography,” as if it was some piece about Germany’s weak fertility rate.

But there is much more to come in that field. Two factors are at work: APIs and semantic improvements. APIs (Application Programming Interfaces) act like the receptors of a cell that exchanges chemical signals with other cells. It’s the way to connect a wide variety of content to the outside world. A story, a video, a graph can “talk” to and be read by other publications, databases, and other “organisms.” But first, it has to pass through semantic filters. From a text, the most basic tools extract sets of words and expressions such as named entities, patronyms, places.

Another higher level involves extracting meanings like “X acquired Y for Z million dollars” or “X has been appointed finance minister.” But what about a video? Some go with granular tagging systems; others, such as Ted Talks, come with multilingual transcripts that provide valuable raw material for semantic analysis. But the bulk of content remains stuck in a dumb form: minimal and most often unstructured tagging. These require complex treatments to make them “readable” by the outside world. For instance, an untranscribed video seen as interesting (say a Charlie Rose interview) will have to undergo a speech-to-text analysis to become usable. This process requires both human curation (finding out what content is worth processing) and sophisticated technology (transcribing a speech by someone speaking super-fast or with a strong accent.)

Great piece on the value of more robust tagging by news organizations.

Rather than treating tagging as an after-the-fact-of-publication activity, tagging needs to be part of the workflow that produces content. Tagging as a step in the process of content production avoids creating a mountain of untagged content.

To what end? Well, imagine simple tagging that associates a reporter with named sources in a report. When the subject of that report comes up in the future, wouldn’t it be a time saver to whistle up all the reporters on that subject with a list of their named contacts?
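A minimal sketch of that kind of lookup, with invented stories, reporters, and sources:

    from collections import defaultdict

    stories = [
        {"subject": "royal family", "reporter": "A. Smith",
         "sources": ["palace spokesperson", "royal historian"]},
        {"subject": "royal family", "reporter": "B. Jones",
         "sources": ["constitutional lawyer"]},
        {"subject": "demography", "reporter": "A. Smith",
         "sources": ["federal statistics office"]},
    ]

    contacts = defaultdict(lambda: defaultdict(set))
    for story in stories:
        contacts[story["subject"]][story["reporter"]].update(story["sources"])

    # Whistle up everyone who has covered the royal family, with their named contacts.
    for reporter, sources in contacts["royal family"].items():
        print(reporter, "->", sorted(sources))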

Never having worked at a newspaper I can’t say for certain, but that sounds like an advantage to this outsider.

That lesson can be broadened to any company producing content. The data in the content had a point of origin, it was delivered from someone, reported by someone else, etc. Capture those relationships and track the ebb and flow of your data and not just the values it represents.

I first saw this in a tweet by Marin Dimitrov.

Growing a Language

Sun, 09/21/2014 - 00:55

Categories:

Topic Maps

Growing a Language by Guy L. Steele, Jr.

The first paper in a new series of posts from the Hacker School blog, “Paper of the Week.”

I haven’t found a good way to summarize Steele’s paper but can observe that a central theme is the growth of programming languages.

While enjoying the Steele paper, ask yourself how would you capture the changing nuances of a language, natural or artificial?

Enjoy!

ApacheCon EU 2014

Sun, 09/21/2014 - 00:36

Categories:

Topic Maps

ApacheCon EU 2014

ApacheCon Europe 2014 – November 17-21 in Budapest, Hungary.

November is going to be here sooner than you think. You need to register now and start making travel arrangements.

A quick scroll down the schedule page will give you an idea of the breadth of Apache Foundation activities.

219 million stars: a detailed catalogue of the visible Milky Way

Sun, 09/21/2014 - 00:22

Categories:

Topic Maps

219 million stars: a detailed catalogue of the visible Milky Way

From the post:

A new catalogue of the visible part of the northern part of our home Galaxy, the Milky Way, includes no fewer than 219 million stars. Geert Barentsen of the University of Hertfordshire led a team who assembled the catalogue in a ten year programme using the Isaac Newton Telescope (INT) on La Palma in the Canary Islands. Their work appears today in the journal Monthly Notices of the Royal Astronomical Society.

The production of the catalogue, IPHAS DR2 (the second data release from the survey programme The INT Photometric H-alpha Survey of the Northern Galactic Plane, IPHAS), is an example of modern astronomy’s exploitation of ‘big data’. It contains information on 219 million detected objects, each of which is summarised in 99 different attributes.

The new work appears in Barentsen et al, “The second data release of the INT Photometric Hα Survey of the Northern Galactic Plane (IPHAS DR2)“, Monthly Notices of the Royal Astronomical Society, vol. 444, pp. 3230-3257, 2014, published by Oxford University Press. A preprint version is available on the arXiv server.

The catalogue is accessible in queryable form via the VizieR service at the Centre de Données astronomiques de Strasbourg. The processed IPHAS images it is derived from are also publicly available.

At 219 million detected objects, each with 99 different attributes, that sounds like “big data” to me.
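A hedged sketch of pulling a small sample from the catalogue with astroquery; the VizieR identifier “II/321” for IPHAS DR2 and the example coordinates are assumptions worth double-checking against the VizieR entry:

    from astropy import units as u
    from astropy.coordinates import SkyCoord
    from astroquery.vizier import Vizier

    vizier = Vizier(row_limit=50)  # keep the sample small
    target = SkyCoord(l=30.0 * u.deg, b=0.0 * u.deg, frame="galactic")
    tables = vizier.query_region(target, radius=5 * u.arcmin, catalog="II/321")

    for table in tables:
        print(len(table), "objects; first columns:", table.colnames[:10])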

Enjoy!

AverageExplorer:…

Sun, 08/17/2014 - 21:22

Categories:

Topic Maps

AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections, Jun-Yan Zhu, Yong Jae Lee, and Alexei Efros.

Abstract:

This paper proposes an interactive framework that allows a user to rapidly explore and visualize a large image collection using the medium of average images. Average images have been gaining popularity as means of artistic expression and data visualization, but the creation of compelling examples is a surprisingly laborious and manual process. Our interactive, real-time system provides a way to summarize large amounts of visual data by weighted average(s) of an image collection, with the weights reflecting user-indicated importance. The aim is to capture not just the mean of the distribution, but a set of modes discovered via interactive exploration. We pose this exploration in terms of a user interactively “editing” the average image using various types of strokes, brushes and warps, similar to a normal image editor, with each user interaction providing a new constraint to update the average. New weighted averages can be spawned and edited either individually or jointly. Together, these tools allow the user to simultaneously perform two fundamental operations on visual data: user-guided clustering and user-guided alignment, within the same framework. We show that our system is useful for various computer vision and graphics applications.

Applying averaging to images, particularly in an interactive context with users, seems like a very suitable strategy.

What would it look like to have interactive merging of proxies based on data ranges controlled by the user?
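The core operation behind an average image is simple enough to sketch. This is my own toy version of a user-weighted average over an aligned image stack, not the paper’s interactive system:

    import numpy as np

    rng = np.random.default_rng(0)
    images = rng.random((100, 64, 64, 3))  # 100 toy RGB "photos", already aligned
    weights = rng.random(100)              # user-indicated importance per image

    # Weighted mean over the stack axis yields one average image.
    average = np.tensordot(weights, images, axes=1) / weights.sum()
    print(average.shape)                   # (64, 64, 3)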

Value-Loss Conduits?

Sun, 08/17/2014 - 20:52

Categories:

Topic Maps

Do you remove links from materials that you quote?

I ask because of the following example:

The research, led by Alexei Efros, associate professor of electrical engineering and computer sciences, will be presented today (Thursday, Aug. 14) at the International Conference and Exhibition on Computer Graphics and Interactive Techniques, or SIGGRAPH, in Vancouver, Canada.

“Visual data is among the biggest of Big Data,” said Efros, who is also a member of the UC Berkeley Visual Computing Lab. “We have this enormous collection of images on the Web, but much of it remains unseen by humans because it is so vast. People have called it the dark matter of the Internet. We wanted to figure out a way to quickly visualize this data by systematically ‘averaging’ the images.”

Which is a quote from: New tool makes a single picture worth a thousand – and more – images by Sarah Yang.

Those passages were reprinted by Science Daily reading:

The research, led by Alexei Efros, associate professor of electrical engineering and computer sciences, was presented Aug. 14 at the International Conference and Exhibition on Computer Graphics and Interactive Techniques, or SIGGRAPH, in Vancouver, Canada.

“Visual data is among the biggest of Big Data,” said Efros, who is also a member of the UC Berkeley Visual Computing Lab. “We have this enormous collection of images on the Web, but much of it remains unseen by humans because it is so vast. People have called it the dark matter of the Internet. We wanted to figure out a way to quickly visualize this data by systematically ‘averaging’ the images.”

Why leave out the hyperlinks for SIGGRAPH and the Visual Computing Laboratory?

Or for that matter, the link to the original paper: AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections (ACM Transactions on Graphics, SIGGRAPH paper, August 2014) which appeared in the news release.

All three hyperlinks enhance your ability to navigate to more information. Isn’t navigation to more information a prime function of the WWW?

If so, we need to clue in ScienceDaily and other content repackagers to include, at the very least, the hyperlinks passed on to them.

If you can’t be a value-add, at least don’t be a value-loss conduit.

TCP Stealth

Sun, 08/17/2014 - 20:31

Categories:

Topic Maps

New “TCP Stealth” tool aims to help sysadmins block spies from exploiting their systems by David Meyer.

From the post:

System administrators who aren’t down with spies commandeering their servers might want to pay attention to this one: A Friday article in German security publication Heise provided technical detail on a GCHQ program called HACIENDA, which the British spy agency apparently uses to port-scan entire countries, and the authors have come up with an Internet Engineering Task Force draft for a new technique to counter this program.

The refreshing aspect of this vulnerability is that the details are being discussed in public, as is a partial solution.

Perhaps this is a step towards transparency for cybersecurity. Keeping malicious actors and “security researchers” only in the loop hasn’t worked out so well.

Whether governments fall into “malicious actors” or “security researchers” I leave to your judgement.

Bizarre Big Data Correlations

Sun, 08/17/2014 - 20:16

Categories:

Topic Maps

Chance News 99 reported the following story:

The online lender ZestFinance Inc. found that people who fill out their loan applications using all capital letters default more often than people who use all lowercase letters, and more often still than people who use uppercase and lowercase letters correctly.

ZestFinance Chief Executive Douglas Merrill says the company looks at tens of thousands of signals when making a loan, and it doesn’t consider the capital-letter factor as significant as some other factors—such as income when linked with expenses and the local cost of living.

So while it may take capital letters into consideration when evaluating an application, it hasn’t held a loan up because of it.

Submitted by Paul Alper

If it weren’t an “online lender,” ZestFinance could take into account applications signed in crayon.

Chance News collects stories with a statistical or probability angle. Some of them can be quite amusing.

Titan 0.5 Released!

Sun, 08/17/2014 - 00:30

Categories:

Topic Maps

Titan 0.5 Released!

From the Titan documentation:

1.1. General Titan Benefits

  • Support for very large graphs. Titan graphs scale with the number of machines in the cluster.
  • Support for very many concurrent transactions and operational graph processing. Titan’s transactional capacity scales with the number of machines in the cluster and answers complex traversal queries on huge graphs in milliseconds.
  • Support for global graph analytics and batch graph processing through the Hadoop framework.
  • Support for geo, numeric range, and full text search for vertices and edges on very large graphs.
  • Native support for the popular property graph data model exposed by Blueprints.
  • Native support for the graph traversal language Gremlin.
  • Easy integration with the Rexster graph server for programming language agnostic connectivity.
  • Numerous graph-level configurations provide knobs for tuning performance.
  • Vertex-centric indices provide vertex-level querying to alleviate issues with the infamous super node problem.
  • Provides an optimized disk representation to allow for efficient use of storage and speed of access.
  • Open source under the liberal Apache 2 license.

A major milestone in the development of Titan!

If you are interested in serious graph processing, Titan is one of the systems that should be on your short list.

PS: Matthias Broecheler has posted Titan 0.5.0 GA Release, which has links to upgrade instructions and comments about a future Titan 1.0 release!

our new robo-reader overlords

Fri, 08/15/2014 - 23:18

Categories:

Topic Maps

our new robo-reader overlords by Alan Jacobs.

After you read this post by Jacobs, be sure to spend time with Flunk the robo-graders by Les Perelman (quoted by Jacobs).

Both raise the issue of what sort of writing can be taught by algorithms that have no understanding of writing?

In a very real sense, the outcome can only be writing that meets but does not exceed what has been programmed into an algorithm.

That is frightening enough for education, but if you are relying on AI or machine learning for intelligence analysis, your stakes may be far higher.

To be sure, software can recognize “send the atomic bomb triggers by Federal Express to this address…,” or at least I hope that is within the range of current software. But what if the message is: “The destroyer of worlds will arrive next week.” Alert? Yes/No? What if it was written in Sanskrit?

I think computers, along with AI and machine learning, can be valuable tools, but not if they are setting the standard for review. At least if you don’t want to dumb down writing and national security intelligence to the level of an algorithm.

I first saw this in a tweet by James Schirmer.

Applauding The Ends, Not The Means

Fri, 08/15/2014 - 21:25

Categories:

Topic Maps

Microsoft scans email for child abuse images, leads to arrest‏ by Lisa Vaas.

From the post:

It’s not just Google.

Microsoft is also scanning for child-abuse images.

A recent tip-off from Microsoft to the National Center for Missing & Exploited Children (NCMEC) hotline led to the arrest on 31 July 2014 of a 20-year-old Pennsylvanian man in the US.

According to the affidavit of probable cause, posted on Smoking Gun, Tyler James Hoffman has been charged with receiving and sharing child-abuse images.

Shades of the days when Kodak would censor film submitted for development.

Lisa reviews the PhotoDNA techniques used by Microsoft and concludes:

The recent successes of PhotoDNA in leading both Microsoft and Google to ferret out child predators is a tribute to Microsoft’s development efforts in coming up with a good tool in the fight against child abuse.

In this particular instance, given this particular use of hash identifiers, it sounds as though those innocent of this particular type of crime have nothing to fear from automated email scanning.
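For readers curious how hash-based image matching works in general, here is a minimal sketch using a generic perceptual hash from the imagehash library; this is not Microsoft’s proprietary PhotoDNA algorithm, and the file names are placeholders:

    from PIL import Image
    import imagehash

    known = imagehash.average_hash(Image.open("known_image.png"))
    candidate = imagehash.average_hash(Image.open("attachment.png"))

    # Hamming distance between hashes; a small distance means a likely match
    # even after minor edits such as resizing or recompression.
    distance = known - candidate
    print("distance:", distance, "match" if distance <= 5 else "no match")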

No sane person supports child abuse so the outcome of the case doesn’t bother me.

However, the use of PhotoDNA isn’t limited to photos of abused children. The same technique could be applied to photos of police officers abusing protesters (wonder where you would find those?), etc.

Before anyone applauds Microsoft for taking the role of censor (in the Roman sense), remember that corporate policies change. The goals of email scanning may not be so agreeable tomorrow.

XPERT (Xerte Public E-learning ReposiTory)

Fri, 08/15/2014 - 17:43

Categories:

Topic Maps

XPERT (Xerte Public E-learning ReposiTory)

From the about page:

XPERT (Xerte Public E-learning ReposiTory) project is a JISC funded rapid innovation project (summer 2009) to explore the potential of delivering and supporting a distributed repository of e-learning resources created and seamlessly published through the open source e-learning development tool called Xerte Online Toolkits. The aim of XPERT is to progress the vision of a distributed architecture of e-learning resources for sharing and re-use.

Learners and educators can use XPERT to search a growing database of open learning resources suitable for students at all levels of study in a wide range of different subjects.

Creators of learning resources can also contribute to XPERT via RSS feeds created seamlessly through local installations of Xerte Online Toolkits. Xpert has been fully integrated into Xerte Online Toolkits, an open source content authoring tool from The University of Nottingham.

Other useful links:

Xerte Project Toolkits

Xerte Community.

You may want to start with the browse option because the main interface is rather stark.

The Google interface is “stark” in the same sense, but Google has indexed a substantial portion of all online content, so I’m not very likely to draw a blank. With XPERT’s base of 364,979 resources, the odds of my drawing a blank are far higher.

The keywords appear in three distinct alphabetical segments: a segment starts with “a” or a digit, runs to the end of the alphabet, and then another segment begins, one after the other. Hebrew and what appears to be Chinese appear at the end of the keyword list, in no particular order. I don’t know whether that is an artifact of the software or of its use.

The same repeated alphabetical segments occur under Author. Under Type there are some true types, such as “color print,” but the majority of the listing is file sizes in bytes. Not sure why a file size would be a “type.” Institution has similar issues.

If you are looking for a volunteer opportunity, helping XPERT with alphabetization would enhance the browsing experience for the resources it has collected.

I first saw this in a tweet by Graham Steel.

Photoshopping The Weather

Fri, 08/15/2014 - 15:23

Categories:

Topic Maps

Photo editing algorithm changes weather, seasons automatically

From the post:

We may not be able control the weather outside, but thanks to a new algorithm being developed by Brown University computer scientists, we can control it in photographs.

The new program enables users to change a suite of “transient attributes” of outdoor photos — the weather, time of day, season, and other features — with simple, natural language commands. To make a sunny photo rainy, for example, just input a photo and type, “more rain.” A picture taken in July can be made to look a bit more January simply by typing “more winter.” All told, the algorithm can edit photos according to 40 commonly changing outdoor attributes.

The idea behind the program is to make photo editing easy for people who might not be familiar with the ins and outs of complex photo editing software.

“It’s been a longstanding interest of mine to make image editing easier for non-experts,” said James Hays, Manning Assistant Professor of Computer Science at Brown. “Programs like Photoshop are really powerful, but you basically need to be an artist to use them. We want anybody to be able to manipulate photographs as easily as you’d manipulate text.”

A paper describing the work will be presented next week at SIGGRAPH, the world’s premier computer graphics conference. The team is continuing to refine the program, and hopes to have a consumer version of the program soon. The paper is available at http://transattr.cs.brown.edu/. Hays’s coauthors on the paper were postdoctoral researcher Pierre-Yves Laffont, and Brown graduate students Zhile Ren, Xiaofeng Tao, and Chao Qian.

For all the talk about photoshopping models, soon the Weather Channel won’t send reporters to windy, rain-soaked beaches, snow-bound roads, or even chasing tornadoes.

With enough information, the reporters can have weather effects around them simulated and eliminate the travel cost for such assignments.

Something to keep in mind when people claim to have “photographic” evidence. Goes double for cellphone video. A cellphone only captures the context selected by its user. A non-photographic distortion that is hard to avoid.

I first saw this in a tweet by Gregory Piatetsky.