Facebook announced today that its developers conference, F8, will return after a three-year hiatus on April 30th at the San Francisco Design Concourse. Parse CEO Ilya Sukhar announced the return of F8 at an event at South By Southwest in Austin, Texas.
Facebook will make event registration and details available shortly. For more information, see the F8 event page here.
Running from Friday February 28 to Sunday March 2, DrupalCamp London was the second largest DrupalCamp ever seen in Europe: Attended by over 600 people, it included a CxO day, over 30 sessions and BoFs, and high-caliber keynotes from organisations ranging from Cancer Research and Government Digital Service to Drupal Association and Acquia.
Drupal's growing up
The event itself had more of a DrupalCon feel than a Camp - emphasising the view that the Drupal community is 'growing up', something that Associate Director at Drupal Association, Megan Sanicki, also drove home in her keynote. We were lucky enough to catch up with Megan, who expanded on topics covered in her speech.
Expanding the Drupal talent pool
Talent was also very much part of the focus of DrupalCamp London. This was reflected not only in the attendance of a group of Drupal Apprentices (above) who met with potential employers at a special BoF, but also in the inclusion of Eric Gaffen, Global Manager, Talent Acquisition at Acquia. Eric travelled over from Boston to be at the event, and in this great short video told us why nurturing great Drupal talent is important for the whole of the Drupal ecosystem.
Networking, sessions... and swag!
Deeson Online not only sponsored the event; MD Tim Deeson was also one of the organisers. And as this Vine proves, he was also king of networking...
Deeson Online Developer Annika Clarke gave a session entitled Introducing Demo Framework, a distribution that aims to make the process of pitching Drupal to new clients a lot easier, while Solutions Architect John Ennew delivered his session on concurrent programming in Drupal. We'll be putting up their well-attended presentations very soon - so watch this space.
But as is the case with anything vaguely Drupaly... it's the swag that people really love, and our We Are Smarter Than Me tees went down a storm again!
2014 DrupalCamp London definitely upped the game, and next year's event will no doubt build on the great foundations set down last weekend. We're looking forward to it already!
Thirty years in, one of gaming’s weirdest persisting legends is about to get mythbusted.
The story is one of a fallen gaming great’s secret shame: a landfill containing millions of copies of Atari’s worst-selling, worst-received game ever, E.T. the Extra-Terrestrial. Atari, a multi-billion-dollar industry leader in its prime, remains a compelling case study in corporate collapse, and a nostalgic soft spot for gamers the world over, who are dying to see what comes out of the landfill.
For decades gamers have speculated about the site’s whereabouts, long thought to be somewhere near Area 51—a fact that amps up the mystery factor, naturally. Xbox, now producing its own original content, took interest in unearthing the mystery and airing the next chapter in a quasi-fable that loyal gamers have followed for three decades.
Fuel Entertainment’s Mike Burns, a longtime Atari fan, is partnering with documentary filmmakers Simon and Jonathan Chinn of Lightbox to provide Microsoft a run of five to ten Xbox Live-exclusive original films, starting with the hour-long “Dumping The Alien: Unearthing The Atari Graveyard.” In the coming months, the team will be literally digging up Atari’s so-called “concrete tomb,” pinpointed to Alamogordo, New Mexico. Lightbox’s Jonathan Chinn and Fuel’s Burns swung by SXSW to update the gaming world on their progress.
Bringing The Myth To Light
Atari’s mega-flop, 1982’s E.T. the Extra-Terrestrial for the Atari 2600, was commissioned as a companion experience to the wild success of the feature film. Atari’s parent company Warner Communications misjudged the gaming community in a decision that proved fatal: instead of easy sales, the game was met with near universal derision, a failure that factored into the massive losses the company experienced starting in 1983. “It’s absolutely unplayable,” says Chinn. “It’s the worst game of all time.”
Gaining access to the site required written permission from the city of Alamogordo, a process that took about a year and a half, including sit-down meetings to convince the mayor of the cultural significance of the city’s (second) odd claim to fame. Lots of things about Alamogordo are weird. The first atomic bomb was detonated nearby, for one.
“If we have to wear hazmat suits in order to excavate it, we will,” says Chinn. "As long as we don’t have a bulldozer hit an atomic bomb that wasn’t detonated—but that would make a theatrical release for sure. I think there’s gonna be a lot of stuff there … it’s not just like a little treasure chest. It’s 10 truckloads of Atari merchandise.”
According to Chinn, even the Smithsonian Institution has expressed interest in taking home a piece of whatever retro gaming history is unearthed. But the excavation team thinks there’ll be so much that, ironically, they’ll probably have to trash the bulk of it all over again.
More Than A Crappy Game
The project, undertaken out of sheer fanboy curiosity before Microsoft was involved, is about more than solving a mystery. The dump site, half urban legend, half corporate failure coverup, symbolizes a half-billion-dollar misfire that went down in history. “Atari should be Apple,” says Chinn. “What the hell happened at Atari? We wanted to unearth the story of why a company that had everything going for it failed.”
For the gaming community, Atari’s E.T. flop is also an emblem of the disconnect between Warner Communications—which bought Atari in 1976—and the era's nascent gaming community. The game’s designer, Howard Scott Warshaw, was given an insanely brief six weeks to create the game, start to finish.
“Some say that considering he had six weeks, it was a masterpiece,” says Chinn. “Atari means something to all of us. The idea that I had video games in my house that I could play whenever I wanted—that’s in a way what the promise of the digital revolution was about.”
When Chinn asked a room of maybe 100 people here at South By Southwest Interactive how many would be interested in making the pilgrimage to New Mexico for the dig in the coming months, roughly 50 raised their hands. The team behind the film invites gamers to show up and take home a literal part of video game history.
But what if they come up empty-handed? Chinn remains confident: “I know they’re there. I know they’re somewhere in that landfill. We can’t dig up the whole landfill; it’s hundreds of acres. If it’s not there … well, we’re going to keep digging until we find it.”
Header image by Taylor Hatmaker for ReadWrite
Weeks after Pebble gave iPhone users access to its smartwatch app store, the company has finally opened it up to Android users as well. At the announcement, Pebble also revealed a new set of app partners, one of whom now offers an intriguing new use for the smartwatch: the ability to control smart homes from your wrist.
See also: 10 Cool Things A Pebble Smartwatch Can Do
Pebble had previously announced a partnership with iControl, whose technology powers smart home services from companies like Comcast Cable, ADT and Time Warner Cable. TWC has now released a Pebble app for its Intelligent Home subscribers. Using the watch, they can change the control modes for their connected-home gadgets—from "home" to "away," for instance, which might turn off lights and lower the thermostat. They can also monitor their thermostats and even change the ambient temperature before they arrive home, among other things.
Pebble's previously announced Mercedes-Benz partnership also materialized this week with its new DriveStyle Pebble app. The killer features include vibration alerts for road hazards, accidents and speed limits, as well as control over navigation, music and social networks. The app, however, only works with a Mercedes vehicle (not included) and an iPhone. You'll also need Drive Kit Plus, an iPhone integration setup for the Mercedes.
Apps from the two other new Pebble partners, eBay and Evernote, are a bit more universal. The former lets users find products on the online auction site, tap into eBay Feeds and add items to Watch Lists. The latter puts Evernote checklists, reminders and notebooks on your wrist.
These join other smartwatch apps—including Yelp, ESPN, Foursquare and GoPro, among others—which are all available in the Pebble Appstore. Android users can get access by updating their Pebble mobile app to version 2.0 in Google Play.
Though some of these apps have a limited audience, they speak to the expanding features and usability of Pebble, in particular, and perhaps smartwatches in general—suggesting that the next great mobile revolution really might be within arm's reach.
Feature image by Adriana Lee for ReadWrite; all others courtesy of Pebble
MOOCs, or massive open online courses, are quickly becoming technology darlings. Companies like Coursera, Udacity, edX and others provide college-caliber online courses taught by professors from the most prestigious universities. Millions of students interested in pursuing inexpensive post-secondary education can take classes on anything from nutritional health to machine learning—right from the comfort of their own home.
It’s not just about learning new skills. "Graduates" of these classes can receive paid course certificates or accreditation, which is always great to showcase on LinkedIn. Some organizations, like Udacity, have even partnered with universities to create entirely MOOC-based degrees.
I registered for a five-week course on Coursera, Terrorism and Counterterrorism: Comparing Theory And Practice. I’m interested in global politics and how the definition and scope of terrorism have changed since September 11, 2001. Since the topic was equally intriguing and different from the tech community I’m knee-deep in, I figured this class would provide a good introduction to massive open online courses.
The course was available under Coursera’s “Signature Track” program, so I paid $49 to receive a certificate of completion when I passed the class. It was a waste of $49.
I failed my first MOOC.
It wasn’t for lack of trying. When I first signed up, I took it very seriously.
MOOCs Are Not A Substitute For College
I’ve argued, and still believe, the traditional university lecture is dead. As online education programs skyrocket in popularity, brick-and-mortar universities are embracing aspects of the online college lecture, like interactive videos and online discussion forums.
The difference is, MOOC professors are teaching thousands of students (hundreds of thousands in some cases), thus eliminating the intimacy of one-on-one interactions that are so beneficial in most offline classroom settings.
See Also: Udacity Ignores Reality, Founds Open Education Alliance
My Coursera professor, Edwin Bakker from Leiden University in the Netherlands, taught the course via video lectures. He provided great insight, paired it with interesting required readings, and led Google Hangouts throughout the course, though only a handful of students were able to participate. Time zone differences and limited space ultimately resulted in a select few students receiving the opportunity to participate in this more intimate online setting.
Furthermore, the MOOC system for reviewing and grading submitted material is still imperfect. Granted, automatically graded quizzes make it easy to keep track of one's marks, and instructors or teaching assistants are good at providing feedback through discussion forums or otherwise. But assignments that required me to submit essays or complex answers beyond multiple-choice questions weren't graded by the instructor, which, in my case, turned out to be detrimental to the overall class experience.
You Just Can’t Trust The Internet
In my entire college career, I never failed a class. I pulled all-nighters to study for tests and write essays, and all the work I put in eventually paid off. My Coursera class was a totally different story.
I'll admit it: I had minimal motivation. Sure, I didn’t want to waste $49, but I certainly didn’t stay up all night finishing a 600-word essay—the goal of receiving a course completion certificate just wasn't appealing enough.
Students on the Signature Track were required to submit two essays and pass multiple quizzes. The quizzes were easy—we were given multiple attempts to get a perfect score—but the essays were a different story. Since the professor was unable to grade them himself, each student was subject to peer reviews—five of them. And each review impacted your grade.
Students were given a rubric to follow, and the graders would base their assessment on that. To pass, we needed to get 60% on each essay; this would account for 30% of the final grade.
See Also: Online Education Is Trying Very Hard To Make Itself More Respectable
I failed my first essay. All but one reviewer gave me a failing grade, for reasons unknown.
One reviewer claimed my using Fox News as a source rendered all my other sources meaningless. (Normally I would agree with the commenter. However, it was an essay about the Oklahoma City Bombing, and I linked to bomber Timothy McVeigh’s letter to Fox News. You can read my essay here.)
Admittedly, the essay was not my best work. When I’m taking a college-level course without paying college-level prices, or getting anything in return besides knowledge or a completion certificate, I simply won’t try as hard. But I did follow the rubric and met all the requirements for a passing grade.
In true Internet fashion, these peer reviews were totally anonymous. I couldn’t discuss with my reviewer why he or she thought my essay was lousy, and I couldn’t defend my link to Fox News. I felt uncomfortable and powerless. Stupid. This is not an environment that encourages productive learning.
To achieve certification, students must finish both essays and grade other students’ contributions. Given how little time I had spent on the first essay, I knew my next one would be just as bad, and I couldn’t give any more of my already busy schedule to this class. So I failed.
A Probability Of Failure
I wish I could say my experience was unique. But if you sign up for a massive open online course, chances are you won’t finish it.
On Coursera, the average student retention rate is just four percent. No more than 51 percent of students passed Udacity’s online math program offered at San Jose State University. And according to a study released in May 2013, the average MOOC completion rate was just 6.8 percent, and the six most-completed courses relied on automatic testing, not peer review grading.
Completion rates for MOOCs are so poor, Udacity’s founder Sebastian Thrun admitted his company doesn’t educate people the way he intended.
In an effort to combat abysmal completion rates, Coursera is creating degree-like programs that give paying students a more substantial certificate of completion after passing all the classes in a specialization certificate group.
“We do believe doing a capstone project and earning a specialization certificate will provide greater incentive and motivation for students to complete,” Coursera cofounder and co-CEO Andrew Ng told ReadWrite earlier this year.
As much as I wanted to finish my course, the time restrictions and the grading process turned me off. And thus, I became just another one of the vast majority of students who fail massive open online courses.
How MOOCs Can Succeed
A few changes, if implemented, would make me want to take another online course.
For starters, anonymous grading should not determine the students' success, at least not by itself. If I had the ability to defend myself and possibly change a grade, I might be more inclined to get actively involved. In college, I was always allowed, if not encouraged, to meet with the professor or teaching assistant who graded my work to challenge or ask questions if I didn't agree with the final grade. Even if I didn't change someone's mind, chatting with someone made me feel more at ease.
YouTube, one of the world's leading online social platforms, recently nixed anonymous comments; now, anyone who chooses to leave feedback on a video must do so with their Google+ profile attached to it. If comments on cat videos require a personal identity, then I think essays for online courses should, too.
One reason MOOCs are so popular is that they're so cheap. While this is good for many students who can't afford a traditional college route, other students require further incentive. The price point for certificates of completion is relatively inexpensive, unlike universities that cost an arm and a leg.
A former Coursera student told me earlier this year that he would rather pay $600 for a class offered through a university than take a similar subject online, simply because he knew he would be more inclined to finish it with a significant investment. Coursera's initiative into specialty courses is aiming to do this: By charging more money for a more comprehensive program, students are incentivized to finish the courses they paid good money for, and the program becomes more well-rounded, too.
MOOCs provide invaluable resources for continuing education and opportunities for students to take courses they might not have otherwise taken. But when I compare my experience, albeit just one course, to the education I received at a traditional university, I wouldn’t trade my in-person college career for a suite of online class credentials, no matter how many university heavyweights stand behind them.
Lead image by Adriana Lee for ReadWrite.
In it Megan highlighted key elements from her keynote speech including how:
- Drupal Association is stepping up their mission to further the project and help expand the community
- They plan to grow from 3% to 10% of the web
- They're investing £850k to improve Drupal.org including marketing to help people understand why they should adopt Drupal 8.
You can also see the slide deck from her DrupalCamp London keynote below. In it Megan shares in more detail the Association's vision and the programs being implemented to help make that happen.
Google chairman Eric Schmidt and Director of Google Ideas Jared Cohen led a discussion at today’s SXSW event in Austin, Texas, which described the ways technology is impacting privacy, security and policy on a global scale.
The two Google execs said they visited at least 35 countries, a majority of them unstable autocracies such as North Korea, and examined the impact technology is having on citizens in those countries.
See also: Google's Game Of Moneyball In The Age Of Artificial Intelligence
As citizens are empowered with mobile phones and connectivity, Eric Schmidt said “revolutions are going to be easier to start, but harder to finish."
According to a report from Pew Internet, emerging nations are catching up to the U.S. when it comes to technology adoption, specifically of mobile devices and social media.
Although Internet access is not nearly as prevalent in developing countries, those who do have access are using social media to reach out to the world. For instance, in Egypt, 88% of Internet users are using social media—taking to those websites to drive global awareness of violence and uprisings around the Arab Spring. Stories surrounding the Syrian civil war are also being told on social media—so much so that the government is attacking its citizens to combat information leaving the country.
Cohen said the situation in Syria is so bad that people are getting killed over their social media posts.
Grassroots revolutions like these have inspired governments to try and control the Internet. But what they’ve found is that by turning off the Internet, they admit they are afraid. Schmidt said dictators' new models revolve around infiltrating and manipulating the Internet, as opposed to shutting it off completely.
Per Cohen:
In the major cities in Damascus, the government has set up check points and ask you for your phone and login information... My friend’s brother resisted and they held a gun to his head.
According to Cohen, his friend's brother eventually relented and gave the government officers his phone. The officials saw something posted on the man's social media page that was sympathetic to the opposition, and so he was shot.
Protecting Citizen Data
When Edward Snowden's disclosures last year revealed that governments were collecting personal data from tech companies, Google was very surprised by the behavior of both the U.S. government and Great Britain’s security administration.
Since then, Google has begun encrypting data at multiple points, using 2048-bit encryption and perfect forward secrecy, which switches keys at every session. In other words, it's now far more difficult to get Google's data than it was before the NSA revelations.
Both Schmidt and Cohen support whistleblowers and leaking potentially scandalous information, but the Google executives believe there need to be better methods for disclosing information and protecting those who come forward.
“Without oversight and without people watching things, misuse can occur,” Schmidt said. “Somebody within those organizations should have said, ‘What happens when someone discovers this?’”
Keeping The Internet Open
The Internet is an open highway, available to people without restrictions in many parts of the world. But as governments fight for control of their people, they often fight for control of the information portal that continues to give them a voice.
In the book The New Digital Age, Schmidt writes about the balkanization of the Internet, and said at the SXSW lecture that it’s entirely possible for governments to create their own intranets to control the flow of information.
See also: Facebook Drones May Soon Be A Reality
Iran is the first country to propose such an option. In 2012, the country pushed for a “national Internet,” which promised to wall off a part of cyberspace for its citizens’ use and therefore be able to control every aspect of it. As a result, Google blocked Gmail in Iran shortly thereafter.
“We’re worried that not only will the balkanization occur, but gradually in a way that no one notices it,” he said. “They might use child safety as a starting point.”
Russia is another country that seeks to control online information. (Ironically, it’s where Snowden is allegedly staying to avoid U.S. prosecution.) Russia allows for the arbitrary removal of videos that feature young children, but the country casts a wide net to take down any videos they disagree with.
People want to control their privacy and governments want to control their citizens' data, which has typically caused a great deal of dissonance. But at Friday's SXSW talk, Schmidt discussed two new trends in technology that are driving both the people and the government to take control of their own information.
“One is empowerment of citizens with mobile devices—they are supercomputers,” Schmidt said. “The other thing is that information once published publicly is no longer revocable.”
As Google continues its push to make Internet available on a wider scale through projects like Fiber and Loon, its executives are making sure those information superhighways continue to remain open for everyone, even as the world's governments vie to control the pipes.
Lead image by Selena Larson for ReadWrite
Who Are the Customers for Intelligence? by Peter C. Oleson.
From the paper:
Who uses intelligence and why? The short answer is almost everyone and to gain an advantage. While nation-states are most closely identified with intelligence, private corporations and criminal entities also invest in gathering and analyzing information to advance their goals. Thus the intelligence process is a service function, or as Australian intelligence expert Don McDowell describes it,
Information is essential to the intelligence process. Intelligence… is not simply an amalgam of collected information. It is instead the result of taking information relevant to a specific issue and subjecting it to a process of integration, evaluation, and analysis with the specific purpose of projecting future events and actions, and estimating and predicting outcomes.
It is important to note that intelligence is prospective, or future oriented (in contrast to investigations that focus on events that have already occurred).
As intelligence is a service, it follows that it has customers for its products. McDowell differentiates between “clients” and “customers” for intelligence. The former are those who commission an intelligence effort and are the principal recipients of the resulting intelligence product. The latter are those who have an interest in the intelligence product and could use it for their own purposes. Most scholars of intelligence do not make this distinction. However, it can be an important one as there is an implied priority associated with a client over a customer. (footnote markers omitted)
If you want to sell the results of topic maps, that is, highly curated data that can be viewed from multiple perspectives, this essay should spark your thinking about potential customers.
You may also find this website useful: Association of Former Intelligence Officers.
Quizz: Targeted Crowdsourcing with a Billion (Potential) Users by Panagiotis G. Ipeirotis and Evgeniy Gabrilovich.
We describe Quizz, a gamified crowdsourcing system that simultaneously assesses the knowledge of users and acquires new knowledge from them. Quizz operates by asking users to complete short quizzes on specific topics; as a user answers the quiz questions, Quizz estimates the user’s competence. To acquire new knowledge, Quizz also incorporates questions for which we do not have a known answer; the answers given by competent users provide useful signals for selecting the correct answers for these questions. Quizz actively tries to identify knowledgeable users on the Internet by running advertising campaigns, effectively leveraging the targeting capabilities of existing, publicly available, ad placement services. Quizz quantifies the contributions of the users using information theory and sends feedback to the advertising system about each user. The feedback allows the ad targeting mechanism to further optimize ad placement.
Our experiments, which involve over ten thousand users, confirm that we can crowdsource knowledge curation for niche and specialized topics, as the advertising network can automatically identify users with the desired expertise and interest in the given topic. We present controlled experiments that examine the effect of various incentive mechanisms, highlighting the need for having short-term rewards as goals, which incentivize the users to contribute. Finally, our cost- quality analysis indicates that the cost of our approach is below that of hiring workers through paid-crowdsourcing platforms, while offering the additional advantage of giving access to billions of potential users all over the planet, and being able to reach users with specialized expertise that is not typically available through existing labor marketplaces.
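The abstract says Quizz quantifies user contributions "using information theory" but doesn't spell out the measure. As a rough illustration only (my assumption, not necessarily the formula used in the paper), one could score a user by how much a single answer from them reduces uncertainty about the correct choice. The sketch below assumes an n-choice question, a user with estimated accuracy q, and wrong answers spread evenly over the other n-1 choices. All function names are hypothetical.

```python
import math


def answer_entropy(q: float, n: int) -> float:
    """Entropy in bits left about the correct choice after one answer.

    Assumes (illustrative only) that the user is right with probability q
    and otherwise picks uniformly among the n - 1 wrong choices.
    """
    if n < 2:
        raise ValueError("need at least two choices")
    if not 0.0 <= q <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    h = 0.0
    if q > 0.0:
        h -= q * math.log2(q)
    if q < 1.0:
        h -= (1.0 - q) * math.log2((1.0 - q) / (n - 1))
    return h


def information_gain(q: float, n: int) -> float:
    """Bits gained per answer relative to a uniform prior over n choices."""
    return math.log2(n) - answer_entropy(q, n)


def total_contribution(q: float, n: int, answers: int) -> float:
    """Hypothetical per-user score: bits per answer times answers given."""
    return answers * information_gain(q, n)
```

Under these assumptions a user who guesses at random (q = 1/n) contributes nothing, while a perfectly reliable user contributes log2(n) bits per answer, which is the kind of signal an ad-targeting feedback loop could optimize for.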
Crowd sourcing isn’t an automatic slam-dunk but with research like this, it will start moving towards being a repeatable experience.
What do you want to author using a crowd?
I first saw this at Greg Linden’s More quick links.
From the post:
We work with a lot of data at ProPublica. It's a big part of almost everything we do — from data-driven stories to graphics to interactive news applications. Today we're launching the ProPublica Data Store, a new way for us to share our datasets and for them to help sustain our work.
Like most newsrooms, we make extensive use of government data — some downloaded from "open data" sites and some obtained through Freedom of Information Act requests. But much of our data comes from our developers spending months scraping and assembling material from web sites and out of Acrobat documents. Some data requires months of labor to clean or requires combining datasets from different sources in a way that's never been done before.
For datasets that are the result of significant expenditures of our time and effort, we're charging a reasonable one-time fee: In most cases, it's $200 for journalists and $2,000 for academic researchers. Those wanting to use data commercially should reach out to us to discuss pricing. If you're unsure whether a premium dataset will suit your purposes, you can try a sample first. It's a free download of a small sample of the data and a readme file explaining how to use it.
The datasets contain a wealth of information for researchers and journalists. The premium datasets are cleaned and ready for analysis. They will save you months of work preparing the data. Each one comes with documentation, including a data dictionary, a list of caveats, and details about how we have used the data here at ProPublica.
A data store you can feel good about supporting!
I first saw this at Nathan Yau’s ProPublica opened a data store.
WorldCat Works Linked Data – Some Answers To Early Questions by Richard Wallis.
The most interesting question Richard answers:
Q Is there a bulk download available?
No there is no bulk download available. This is a deliberate decision for several reasons.
Firstly this is Linked Data – its main benefits accrue from its canonical persistent identifiers and the relationships it maintains between other identified entities within a stable, yet changing, web of data. WorldCat.org is a live data set actively maintained and updated by the thousands of member libraries, data partners, and OCLC staff and processes. I would discourage reliance on local storage of this data, as it will rapidly evolve and become out of synchronisation with the source. The whole point and value of persistent identifiers, which you would reference locally, is that they will always dereference to the current version of the data.
I will give you one guess on who is deciding on the entities, identifiers and relationships to be maintained.
Hint: It’s not you.
Which in my view is one of the principal weaknesses of Linked Data.
In order to participate, you have to forfeit your right to organize your world differently than it has been organized by Richard Wallis, WorldCat and others.
I am sure they all have good intentions and WorldCat will come close enough for most of my purposes, but I’m not interested in a one world view, whoever agrees with it. Even me.
If you are good with graphics, take the original Apple commercial:
and reverse it.
Show users and a screen of vivid diversity and show a Richard Wallis look-alike touching the side of the projection screen and the uniform grayness of linked data starts to spread across it. As it does, the users in the audience who have been in traditional dress start to look like the starting audience in Apple’s 1984 commercial.
That’s the intellectual landscape that linked data promises. Do you really want to go there?
Nothing against standards, I have helped write one or two of them. But I do oppose uniformity for the sake of empowering self-appointed guardians.
Particularly when that uniformity is a tepid grey that doesn’t reflect the rich and discordant hues of human intellectual history.
Using Lucene’s search server to search Jira issues by Michael McCandless.
From the post:
That application has become a powerful showcase of a number of modern Lucene features such as drill sideways and dynamic range faceting, a new suggester based on infix matches, postings highlighter, block-join queries so you can jump to a specific issue comment that matched your search, near-real-time indexing and searching, etc. Whenever new users ask me about Lucene’s capabilities, I point them to this application so they can see for themselves.
Recently, I’ve made some further progress so I want to give an update.
The source code for the simple Netty-based Lucene server is now available on this subversion branch (see LUCENE-5376 for details). I’ve been gradually adding coverage for additional Lucene modules, including facets, suggesters, analysis, queryparsers, highlighting, grouping, joins and expressions. And of course normal indexing and searching! Much remains to be done (there are plenty of nocommits), and the goal here is not to build a feature rich search server but rather to demonstrate how to use Lucene’s current modules in a server context with minimal “thin server” additional source code.
Separately, to test this new Lucene based server, and to complete the “dog food,” I built a simple Jira search application plugin, to help us find Jira issues, here. This application has various Python tools to extract and index Jira issues using Jira’s REST API and a user-interface layer running as a Python WSGI app, to send requests to the server and render responses back to the user. The goal of this Jira search application is to make it simple to point it at any Jira instance / project and enable full searching over all issues.
Of particular interest to me because OASIS is about to start using JIRA 6.2 (the version in use at Apache).
I haven’t looked closely at the documentation for JIRA 6.2.
Thoughts on where it has specific weaknesses that are addressed by Michael’s solution?
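To make the "extract and index" step a little more concrete, here is a minimal sketch in Python of the kind of glue such a tool needs. It is not Michael's code. The function names `search_url` and `issue_to_document`, the example host, and the shape of the output document are all assumptions made for illustration; only the `/rest/api/2/search` endpoint with its `jql`, `startAt` and `maxResults` parameters comes from Jira's REST API.

```python
from urllib.parse import urlencode


def search_url(base_url, project, start_at=0, max_results=50):
    """Build a paged Jira REST search URL for one project (hypothetical helper)."""
    query = urlencode({
        "jql": f"project = {project} ORDER BY created ASC",
        "startAt": start_at,
        "maxResults": max_results,
    })
    return f"{base_url.rstrip('/')}/rest/api/2/search?{query}"


def issue_to_document(issue):
    """Flatten one issue (as a parsed JSON dict) into a flat document to index.

    The output field names are an assumption, not the schema of any real server.
    Comments may be absent, depending on which fields were requested.
    """
    fields = issue.get("fields") or {}
    comments = (fields.get("comment") or {}).get("comments") or []
    return {
        "key": issue.get("key"),
        "summary": fields.get("summary") or "",
        "description": fields.get("description") or "",
        "comments": [c.get("body") or "" for c in comments],
    }


if __name__ == "__main__":
    # Hypothetical Jira instance, used only to show the URL that gets built.
    print(search_url("https://issues.example.org/jira", "LUCENE"))
```

Fetching each page and sending the flattened documents to a search server are left out on purpose, since those details depend on the server you point it at.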
Today Facebook announced plans for a new data center in Luleå, Sweden, based on a modular architectural concept the company calls “rapid deployment data center,” or RDDC. One of the construction approaches, “flat pack” (basically a way of packing together the modular walls of a data center into easily transportable units, much like a box containing a disassembled bookshelf), was inspired by Ikea, the minimalist furniture and home accessory company that's also based in Sweden.
Sadly, there's no word on the assembly instructions, which are presumably in pictorial form, or whether hex wrenches are included with every set.
From the post:
The explosion of data is leading to new business opportunities that draw on advanced analytics and require a broader, more sophisticated skill set, including software development, data engineering, math and statistics, subject matter expertise, and fluency in a variety of analytics tools. Brought together by data scientists, these capabilities can lead to deeper market insights, more focused product innovation, faster anomaly detection, and more effective customer engagement for the business.
The Data Science Challenge Solution Kit is your best resource to get hands-on experience with a real-world data science challenge in a self-paced, learner-centric environment. The free solution kit includes a live data set, a step-by-step tutorial, and a detailed explanation of the processes required to arrive at the correct outcomes.
Data Science at Your Desk
The Web Analytics Challenge includes five sections that simulate the experience of exploring, then cleaning, and ultimately analyzing web log data. First, you will work through some of the common issues a data scientist encounters with log data and data in JSON format. Second, you will clean and prepare the data for modeling. Third, you will develop an alternate approach to building a classifier, with a focus on data structure and accuracy. Fourth, you will learn how to use tools like Cloudera ML to discover clusters within a data set. Finally, you will select an optimal recommender algorithm and extract ratings predictions using Apache Mahout.
With the ongoing confusion about what it means to be a “data scientist,” having a certification or two isn’t going to hurt your chances for employment.
And you may learn something in the bargain.
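For a taste of the first two sections of the challenge (log data in JSON format, then cleaning), here is a small Python sketch of my own. It is not taken from the solution kit. It assumes one JSON object per log line, and the field names `user` and `ts` are hypothetical.

```python
import json


def parse_log_lines(lines):
    """Parse JSON-per-line log data, returning (records, malformed_count)."""
    records, bad = [], 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are ignored, not counted as malformed
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            bad += 1
            continue
        if not isinstance(rec, dict):
            bad += 1  # valid JSON, but not the object we expect per line
            continue
        records.append(rec)
    return records, bad


def clean_records(records, required=("user", "ts")):
    """Drop records where a required field is missing, None, or empty."""
    cleaned = []
    for rec in records:
        if any(rec.get(key) in (None, "") for key in required):
            continue
        cleaned.append(rec)
    return cleaned


if __name__ == "__main__":
    sample = ['{"user": "u1", "ts": 1}', "not json", '{"user": "", "ts": 2}']
    records, bad = parse_log_lines(sample)
    print(len(clean_records(records)), "usable records,", bad, "malformed lines")
```

Counting the malformed lines rather than silently dropping them is deliberate: how much of the log fails to parse is itself something you want to know before modeling.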