
Archive for July, 2006 | Monthly archive page

Search By Meaning

In Uncategorized on July 29, 2006 at 8:28 am

I’ve been working on a detailed technical scheme for a “search by meaning” search engine (as opposed to [dumb] Google-like search by keyword), and I have to say that, in conquering the workability challenge within my limited scope, I can see the huge problem facing Google and other Web search engines in transitioning to a “search by meaning” model.

Related

  1. Wikipedia 3.0: The End of Google?
  2. P2P 3.0: The People’s Google
  3. Intelligence (Not Content) is King in Web 3.0
  4. Web 3.0 Blog Application
  5. Towards Intelligent Findability
  6. All About Web 3.0

Tags:

Semantic Web, Web standards, Trends, OWL, innovation, Startup, Evolution, Google, inference engine, Web 2.0, Web 3.0, AI, Wikipedia, Wikipedia 3.0, Info Agent, Semantic MediaWiki, DBin, P2P 3.0, P2P AI, P2P Semantic Web inference Engine, intelligent findability, search by meaning


The Geek VC Fund Project: 7/26 Update

In Uncategorized on July 25, 2006 at 10:55 pm

This post is an update to the original post about the Geek-Run, Geek-Funded Venture Capital Fund.

  1. The idea has evolved by leaps and bounds.
  2. First project to be funded within 4-5 months.
  3. The framework will be shared with the public at large when we have the first fruits of our labor.

Tags:

Web 2.0, venture capital, VC, entrepreneur, funding, private equity, geek, seed funding, early stage, Startup

WordPress Logo Competition

In Uncategorized on July 20, 2006 at 10:49 pm

I’d like to call your attention to the WordPress logo competition going on in the Ideas forums on WordPress.com.

Here is my entry for all the w^P fans:

Stolen art. That’s for sure.

Beats

  1. Red Star Over Qubah

Google dont like Web 3.0 [sic]

In Uncategorized on July 20, 2006 at 11:40 am

Why am I not surprised?

Google exec challenges Berners-Lee

The idea is that the Semantic Web will allow people to run AI-enabled P2P search engines that will collectively be more powerful than Google can ever be, relegating Google to just another source of information, especially as Wikipedia [not Google] is positioned to lead the creation of the domain-specific ontologies that are the foundation for machine reasoning [about information] in the Semantic Web.

Additionally, we could see content producers (including bloggers) creating informal ontologies on top of the information they produce, using a standard language like RDF. This would have the same effect as far as P2P AI search engines and Google’s anticipated slide into the commodity layer are concerned (unless, of course, they develop something like GWorld).

In summary, any attempt to arrive at widely adopted Semantic Web standards would significantly lower the value of Google’s investment in the current non-semantic Web by commoditizing “findability” and allowing intelligent info agents to be built that could collaborate with each other to find answers more effectively than the current version of Google (using “search by meaning” as opposed to “search by keyword”) and more cost-efficiently than any future AI-enabled version of Google (using disruptive P2P AI technology).

For more information, see the articles below.

Related

  1. Wikipedia 3.0: The End of Google?
  2. All About Web 3.0
  3. P2P 3.0: The People’s Google
  4. Intelligence (Not Content) is King in Web 3.0
  5. Web 3.0 Blog Application
  6. Towards Intelligent Findability
  7. Why Net Neutrality is Good for Web 3.0

Somewhat Related

  1. Is Google a Monopoly?

Tags:

Semantic Web, Web standards, Trends, OWL, Google, inference engine, AI, Web 2.0, Web 3.0, Wikipedia, Wikipedia 3.0, Info Agent, Semantic MediaWiki, DBin, P2P 3.0, P2P AI, P2P Semantic Web inference Engine, semantic blog, intelligent findability, RDF

Towards Intelligent Findability

In Uncategorized on July 19, 2006 at 9:09 am

A lot of buzz about Web 3.0 and Wikipedia 3.0 has been generated lately from this blog, so I’ve decided that for my guest post here I’d like to dive into this idea and take a look at how we’d build a Semantic Content Management System (CMS).

Objective

We want a CMS capable of building a knowledge base (that is, a set of domain-specific ontologies) with formal deductive reasoning capabilities.

Requirements

  1. A semantic CMS framework.
  2. An ontology API.
  3. An inference engine.
  4. A framework for building info-agents.

HOW-TO

The general idea would be something like this:

  1. Users use a semantic CMS like Semantic MediaWiki to enter information as well as semantic annotations (to establish semantic links between concepts in the given domain, on top of the content). This typically produces an informal ontology on top of the information, which, when combined with domain inference rules and the query structures (for the particular schema) that are implemented in an independent info agent or built into the CMS, would give us a Domain Knowledge Database. (Alternatively, we can have users enter information into a non-semantic CMS to create content based on a given doctype or content schema and then front-end it with an info agent that works with a formal ontology of the given domain, but we would then need to perform natural language processing, including using statistical semantic models, since we would lose the certainty that would normally be provided by the semantic annotations, which, in a Semantic CMS, break down the natural language in the information into a definite semantic structure.)
  2. Another set of info agents adds inferencing-based querying services to our knowledge base, covering information on the Web or in other domain-specific databases. User-entered information plus information obtained from the Web makes up our Global Knowledge Database.
  3. We provide a Web-based interface for querying the inference engine.

Each doctype or schema (depending on the CMS of your choice) will have a more or less direct correspondence with our ontologies (i.e. one schema or doctype maps to one ontology). The sum of all the content of a particular schema makes up a knowledge domain, which, when transformed into a semantic model like RDF (or, more specifically, OWL) and combined with the domain inference rules and the query structures (for the particular schema), constitutes our knowledge database. The choice of CMS is not relevant as long as you can query its contents while being able to define schemas. What is important is the need for an API to access the ontology. Luckily, projects like JENA fill this void perfectly, providing both an RDF and an OWL API for Java.
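
To make the JENA piece concrete, here is a minimal sketch, in Java, of loading a domain ontology and walking its classes. It assumes the classic Jena 2 package names (com.hp.hpl.jena.*; newer releases use org.apache.jena.*), and the ontology URL and namespace are placeholders, not real resources.

import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

public class LoadDomainOntology {
    public static void main(String[] args) {
        // Create an in-memory OWL model (no reasoning attached yet).
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);

        // Placeholder URL: point this at the OWL/RDF export of your domain ontology,
        // e.g. one generated from your CMS schema or a Semantic MediaWiki export.
        model.read("http://example.org/ontologies/music-domain.owl");

        // Walk the class hierarchy this ontology defines for the knowledge domain.
        ExtendedIterator classes = model.listClasses();
        while (classes.hasNext()) {
            OntClass cls = (OntClass) classes.next();
            if (cls.getURI() != null) {
                System.out.println(cls.getURI());
            }
        }
    }
}

Swapping OWL_MEM for one of Jena's inference-enabled OntModelSpec values is what later turns this plain store into a reasoning model.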

In addition, we may want an agent to add to or complete our knowledge base using available Web Services (WS). I’ll assume you’re familiar with WS, so I won’t go into details.

Now, the inference engine would seem like a very hard part. It is. But not for lack of existing technology: the W3C already has a recommended language for querying RDF (i.e. a semantic query language) known as SPARQL (http://www.w3.org/TR/rdf-sparql-query/), and JENA already has a SPARQL query engine.
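
As a sketch of what such a query could look like in practice, here is a hedged example that runs a SPARQL SELECT through Jena’s ARQ engine (same package-name caveat as above); the data URL, the namespace, and the relatedTo property are invented for illustration.

import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class SparqlDemo {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        // Placeholder: the RDF produced from the CMS content (e.g. a Semantic MediaWiki export).
        model.read("http://example.org/data/knowledge-base.rdf");

        // Hypothetical vocabulary: ex:relatedTo links one concept to another.
        String sparql =
            "PREFIX ex: <http://example.org/ontology#> " +
            "SELECT ?concept WHERE { ?concept ex:relatedTo ex:ConsumerElectronics }";

        Query query = QueryFactory.create(sparql);
        QueryExecution qexec = QueryExecutionFactory.create(query, model);
        try {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.nextSolution();
                System.out.println(row.get("concept"));
            }
        } finally {
            qexec.close();
        }
    }
}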

The difficulty lies in the construction of ontologies which would have to be formal (i.e. consistent, complete, and thoroughly studied by experts in each knowledge-domain) in order to obtain powerful deductive capabilities (i.e. reasoning).

Conclusion

We already have technology powerful enough to build projects such as this: solid CMSs, standards such as RDF, OWL, and SPARQL, as well as a stable framework for using them such as JENA. There are also many frameworks for building info-agents, but you don’t necessarily need a specialized framework; a general software framework like J2EE is good enough for the tasks described in this post.

All we need to move forward with delivering on the Web 3.0 vision (see 1, 2, 3) is the will of the people and your imagination.

Addendum

In the diagram below, the domain-specific ontologies (OWL 1 … N) could all be built by Wikipedia (see Wikipedia 3.0), since they already have the largest online database of human knowledge and the domain experts among their volunteers to build the ontologies for each domain of human knowledge. One possible path is for Wikipedia to build informal ontologies using Semantic MediaWiki (as Ontoworld is doing for the Semantic Web domain of knowledge), but Wikipedia may wish to wait until it has the ability to build formal ontologies, which would enable more powerful machine-reasoning capabilities.

[Note: The ontologies simply allow machines to reason about information. They are not information but meta-information. They have to be formally consistent and complete for best results as far as machine-based reasoning is concerned.]

However, individuals, teams, organizations and corporations do not have to wait for Wikipedia to build the ontologies. They can start building their own domain-specific ontologies (for their own domains of knowledge) and use Google, Wikipedia, MySpace, etc. as sources of information. But as stated in my latest edit to Eric’s post, we would have to use natural language processing in that case, including statistical semantic models, as the information won’t be pre-semanticized (or semantically annotated), which makes the task more difficult (for us and for the machine …)

What was envisioned in the Wikipedia 3.0: The End of Google? article was that, since Wikipedia has the volunteer resources and the world’s largest database of human knowledge, it will be in the powerful position of being the developer and maintainer of the ontologies (including the semantic annotations/statements embedded in each page), which will become the foundation for intelligence (and “Intelligent Findability”) in Web 3.0.

This vision is also compatible with the vision for P2P AI (or P2P 3.0), where users run P2P inference engines on their PCs that communicate and collaborate with each other and that tap into information from Google, Wikipedia, etc., which will ultimately push Google and central search engines down to the commodity layer (eventually making them a utility business just like ISPs.)

Diagram

Related

  1. Wikipedia 3.0: The End of Google? June 26, 2006
  2. Wikipedia 3.0: El fin de Google (traducción) July 12, 2006
  3. Web 3.0: Basic Concepts June 30, 2006
  4. P2P 3.0: The People’s Google July 11, 2006
  5. Why Net Neutrality is Good for Web 3.0 July 15, 2006
  6. Intelligence (Not Content) is King in Web 3.0 July 17, 2006
  7. Web 3.0 Blog Application July 18, 2006
  8. Semantic MediaWiki July 12, 2006
  9. Get Your DBin July 12, 2006

 

Tags:

Semantic Web, Web standards, Trends, OWL, innovation, Startup, Google, GData, inference engine, AI, ontology, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, Ontoworld, Wikipedia AI, Info Agent, Semantic MediaWiki, DBin, P2P 3.0, P2P AI, P2P Semantic Web inference Engine, semantic blog, intelligent findability, JENA, SPARQL, RDF

All About Web 3.0

In Uncategorized on July 18, 2006 at 3:29 pm

Please see the Web 3.0 section.

 

Tags:

Semantic Web, Web standards, Trends, OWL, innovation, Startup, Evolution, Google, GData, inference, inference engine, AI, ontology, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, collective consciousness, Ontoworld, Wikipedia AI, Info Agent, Semantic MediaWiki, DBin, P2P 3.0, P2P AI, AI Matrix, P2P Semantic Web inference Engine, semantic blog, intelligent findability

Semantic Blog

In Uncategorized on July 17, 2006 at 9:02 pm

Author: Marc Fawzi

Twitter: http://twitter.com/#!/marcfawzi

License: Attribution-NonCommercial-ShareAlike 3.0

Background

As concluded in my previous post, there’s an exponential growth in the amount of user-generated content (videos, blogs, photos, P2P content, etc).

The enormous amount of free content available today is just too much for the current “dumb search” technology that is used to access it.

I believe that content is now a commodity and the next layer of value is all about “Intelligent Findability.”

Take my blog, for example: it’s less than 60 days old, and I’ve never blogged before, but as of today it already has ~500 daily RSS subscribers (and growing), with a noticeable increase after the iPod post I made 3 days ago, 6,281 incoming links (according to MSN), and ~70,000 page views in total so far (mostly due to the Wikipedia 3.0 post, which according to Alexa.com reached an estimated ~2M people). That demonstrates the potential of blogs to generate and spread lots of content.

So there is a lot of blog-generated content (if you consider how many bloggers are out there) and that doesn’t even include the hundreds of thousands (or millions?) of videos and photos uploaded daily to YouTube, Google Video, Flickr and all those other video and photo sharing sites. It also doesn’t include the 30% of total Internet bandwidth being sucked up by BitTorrent clients.

There’s just too much content and no seriously effective way to find what you need. Google is our only hope for now but Google is rudimentary compared to the vision of Semantic-Web Info Agents expressed in the Wikipedia 3.0 and Web 3.0 articles.

Idea

We’d like to embed “Intelligent Findability” into a blogging application so that others will be able to get the most of the information, ideas and analyses we generate.
If you do a search right now for “cool consumer idea” you will not get the iPod post. Instead you will get this post, but that is because I’m specifically making the association between “cool consumer idea” and “iPod” in this post.

Google tries to get around the debilitating limitation of keyword-based search engine technology in the same way by letting people associate phrases or words with a given link. If enough people linked to the iPod post and put the words “cool consumer idea” in the link then when searching Google for “cool consumer idea” you will see the iPod post. However, unless people band together and decide to call it a “cool consumer idea” it won’t show up in the search results. You would have to enter something like “portable music application” (which is actually one of the search results that showed up on my WordPress dashboard today.)

Using Semantic MediaWiki (which allows domain experts to embed semantic annotations into the information), I could insert semantic annotations to semantically link concepts in the information on this blog. That would build an ontology defining semantic relationships between terms in the information (i.e. meaning), where “iPod” would be semantically related to “product”, which would be semantically related to “consumer electronics”, and where the phrase Portable Music Studio would be semantically related (through the use of annotations) to “vision”, “idea”, “concept”, “entertainment”, “music”, “consumer electronics”, “mp3 player” and so on, while “iPod” would also be semantically related to “cool” (as in: what is “cool”?) Thus, using rules of inference for my domain of knowledge, I should be able to deliver an intelligent search capability that deductively reasons the best match to a search query, based on matching the deduced meanings (represented as semantic graphs) from the user’s query and the information.

The quality of the deductive capability would depend on the consistency and completeness of the semantic annotations and the pan-domain or EvolvingTrends-domain ontology that I would build, among other factors. But generally speaking, since the ontology and the semantic annotations would be built by me, if we think alike (or have a fairly similar semantic model of the world) then you will not only be able to read my blog, you will be able to read my mind. The idea is that, with my help in supplying the semantic annotations, such a system will be able to deduce possible meaning (as a graph of semantic relationships) out of each sentence in the post and respond to search queries by reasoning about meaning rather than matching keywords.
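
As a rough illustration of the annotations described above, here is a hedged Jena sketch (Java, with an invented namespace and invented property and resource names) that records a few of those semantic relationships as RDF triples on top of the blog content and then walks the resulting graph, which is the structure an inference engine would reason over instead of raw keywords.

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.StmtIterator;

public class SemanticBlogSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        String ns = "http://example.org/evolvingtrends#"; // hypothetical namespace

        Property relatedTo = model.createProperty(ns, "semanticallyRelatedTo");
        Resource iPod = model.createResource(ns + "iPod");
        Resource product = model.createResource(ns + "Product");
        Resource consumerElectronics = model.createResource(ns + "ConsumerElectronics");
        Resource cool = model.createResource(ns + "Cool");
        Resource portableMusicStudio = model.createResource(ns + "PortableMusicStudio");

        // The kind of annotations a blog author would supply on top of the post text.
        model.add(iPod, relatedTo, product);
        model.add(product, relatedTo, consumerElectronics);
        model.add(iPod, relatedTo, cool);
        model.add(portableMusicStudio, relatedTo, iPod);
        model.add(portableMusicStudio, relatedTo, cool);

        // A keyword engine would miss these links; here we can walk the graph directly.
        StmtIterator it = model.listStatements(null, relatedTo, iPod);
        while (it.hasNext()) {
            System.out.println(it.nextStatement().getSubject() + " is related to iPod");
        }
    }
}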

This is possible with Semantic MediaWiki (which is under development). However, in this particular instance, I don’t want a Semantic Wiki; I want a Semantic Blog. But that should be just a simple step away.

Related

  1. Wikipedia 3.0: The End of Google?
  2. Towards Intelligent Findability
  3. Web 3.0: Basic Concepts
  4. Intelligence (Not Content) is King in Web 3.0
  5. Semantic MediaWiki

Tags:

semantic web, Web 3.0, Semantic MediaWiki, semantic blog, intelligent findability, inference engine

Intelligence (Not Content) is King in Web 3.0

In Uncategorized on July 17, 2006 at 2:35 pm

Author: Marc Fawzi

Twitter: http://twitter.com/#!/marcfawzi

License: Attribution-NonCommercial-ShareAlike 3.0

Observation

  1. There’s an enormous amount of free content on the Web.
  2. Pirates will always find ways to share copyrighted content, i.e. get content for free.
  3. There’s an exponential growth in the amount of free, user-generated content.
  4. Net Neutrality (or the lack of a two-tier Internet) will only help ensure the continuance of this trend.
  5. Content is becoming so commoditized that it costs only the monthly ISP fee to access.

Conclusions (or Hypotheses)

The next value paradigm in the content business is going to be about embedding “intelligent findability” into the content layer, by using a semantic CMS (like Semantic MediaWiki, which enables domain experts to build informal ontologies [or semantic annotations] on top of the information) and by adding inferencing capabilities to existing search engines. I know this represents less than the full vision for Web 3.0 as I’ve outlined in the Wikipedia 3.0 and Web 3.0 articles, but it’s a quantum leap above and beyond the level of intelligence that exists today within the content layer. Also, semantic CMSs can be part of P2P Semantic Web Inference Engine applications that would push central search models like Google’s a step closer to being a “utility” like transport, unless Google builds its own AI, which would ultimately have to compete with P2P semantic search engines (see: P2P 3.0: The People’s Google and Get Your DBin.)

In other words, “intelligent findability” NOT content in itself will be King in Web 3.0.

Related

  1. Towards Intelligent Findability
  2. Wikipedia 3.0: The End of Google?
  3. Web 3.0: Basic Concepts
  4. P2P 3.0: The People’s Google
  5. Why Net Neutrality is Good for Web 3.0
  6. Semantic MediaWiki
  7. Get Your DBin

Tags:

net neutrality, two-tier internet, content, Web 3.0, inference engine, semantic-web, artificial intelligence, ai

Why Net Neutrality is Good for Web 3.0

In Uncategorized on July 15, 2006 at 1:30 pm

(this post was last updated at 10:00am EST, July 22, ’06)

Facts

1. Telcos and Cable companies in the US are legally disallowed from blocking other carriers’ VoIP traffic. Last year, the FCC fined a North Carolina CLEC for doing that to Vonage.

2. Telcos and Cable companies have been in a turf war ever since cable companies started offering Internet access. This turf war escalated after cable companies started offering VoIP phone service, thus cutting deeply into the telcos’ main revenue stream.

3. The telcos’ response to the Cable companies’ entry into the phone market is to roll out their own TV services, based on IPTV (TV over IP), which are being rolled out at the speed of local and state government bureaucracies. IPTV would be carried on DSL lines, FTTC or FTTH.

4. The telcos’ response to Skype, Vonage, and Yahoo IM (with VoIP), as well as to YouTube (and Google Video), which together threaten the telcos’ business model in the phone service and video delivery areas, was their push for a two-tiered Internet, in which the telcos, who happen to own the Internet backbones, would de-prioritize VoIP and video traffic from Skype, Vonage, YouTube, Google, Yahoo, and others.

Net Neutrality

The telcos already charge the end user (in case they serve the end user directly) and the cable companies (for use of their backbone when traffic has to travel outside of the cable company’s own network.)

So I just don’t see why the telcos would have to charge the cable companies, Google, YouTube, Yahoo, Vonage, Skype, MSN, etc one more time.

The telcos’ backbones are not being used for free. They are either paid for by the telco’s users (if the telco is the ISP) or by the cable companies and CLECs using those backbones, who pass the cost to their users. So it’s us, the end users, who are paying for those backbones, not the telcos, as the telcos make it sound.

But it seems that the telcos are saying that they’re not charging enough for those backbones to ensure continued investment on their part in growing their backbone capacities, and instead of increasing how much they charge for traffic, which would increase our monthly access fees, they’re proposing to charge the heavy content providers (e.g. YouTube, Google, others) for high-priority traffic (e.g. VoIP, video streams) and to do the same to the VoIP transport providers (e.g. Skype, Vonage, etc.)

Google, Skype, Yahoo, MSN and others, seeing how that would hurt their business interests and the interests of their users by forcing them to charge users for content and VoIP transport, have sponsored a Net Neutrality bill, which, to the best of my knowledge, has had a hard time making it through the House and the Senate.

Two Tier Internet

The telcos are struggling against the inevitable: that they will be a commodity industry like the railroad or trucking industries. The telcos, who understand all of the above, do not want to be confined to the transport of traffic because the transport business has become a commodity.

The same argument applies to VoIP transport providers. VoIP transport has become (or is becoming) a commodity business.

And if you ask me, “content” is also becoming a commodity business since the huge and ever-growing number of news, analysis and entertainment blogs, the millions of people who contribute their home videos, the pirates who can always figure out ways to share copyrighted content, and the tons of yet-to-be-explored opportunities for user-generated content all mean that content is now officially commoditized. In fact, content is so commoditized all it costs now is the small monthly fee users pay their ISP to access the net.

The Two-Tier Internet is an attempt by the telcos to attach artificially enhanced value to content once again by making content producers pay them for delivering their content without jitters and delays. It is also an attempt to attach artificially enhanced value to transport by forcing VoIP transport providers like Skype, Vonage, Yahoo etc to pay them to have their VoIP traffic transported without jitters and delays.

The Two-Tier Internet, aka the attempt by the telcos to attach artificially enhanced value to content and transport, seems anti-progress and is simply going nowhere.

However, the question is: who will pay to invest in new backbone capacity? The answer (or part of the answer) is that content providers like Google are investing in building their own networks (between their data centers), and such efforts can conceivably grow into new backbone investments, where Google, Yahoo, AOL et al. would be investing in new network capacity growth.

If Content has Become a Commodity Then How Will Content and Transport Providers Deliver Genuine Enhanced Value?

The answer that I propose is by embedding intelligent findability (forget keyword- and tag-indexed information; think Web 3.0!) into their ad-supported content layer.

So instead of “dumb search” (which gives us “dumb content”) we would embrace the Web 3.0 model of intelligent findability (i.e. allowing the machines to use information in an intelligent manner to find what we’re looking for.)

No wonder Tim Berners-Lee (the father of the Web and the originator of the “Semantic Web,” which I had popularized as Web 3.0 in the Wikipedia 3.0 article) has come out strongly in favor of net neutrality. Having said that, I’m not sure whether or not he would agree that the natural commoditization of “dumb content,” which would be assured of continuing under Net Neutrality, would help us get to the Web 3.0 model of intelligent findability sooner than if there were a two-tier Internet. The latter, in my opinion, would slow down the commoditization of ‘dumb content’, thus giving value-driven innovators less reason to explore the next layer of value in the content business, which I’m proposing is the Web 3.0 model of intelligent findability.

Related

  1. Towards Intelligent Findability
  2. Wikipedia 3.0: The End of Google?
  3. Intelligence (Not Content) is King in Web 3.0

Tags:

net neutrality, two-tier internet, content, Web 3.0, VoIP transport, VoIP, IPTV, Semantic Web

iPod As A Portable Music Studio

In Uncategorized on July 14, 2006 at 11:16 am

Idea

Build a smaller-sized version of Apple’s GarageBand music making software right into the iPod.

Why?

So we can make our own tunes man!

And remix “A Cappellas” with our own beats!

And sell our own productions on iTunes!

It’s all about user-generated content …

When will the iPod jump on the Web 2.0 bandwagon?

But Why?

Because that would totally rock!

Can it be Done?

In ’02/’03 I invested in a project where we made a version of GarageBand for the mobile platforms. The prototype worked fine (with 8 tracks, real-time BPM matching and anti-clipping) but in 2003 the VCs had left town and no one was investing :D

The iPod (and especially the Video iPod) uses a much more powerful processor than the Game Boy Advance. So it should be able to go up to 16 tracks (or more) and have complex synthesizers, drum machines and sound effect generators (e.g. see FruityLoops) so users can make killer loops (aka “samples”)! Users would hunt for and gather samples (i.e. trade them on forums, blogs, etc.) as well as open-source their amateur productions.

Now that’s what I call impulsive consumption and production!

And You Don’t Have to Wait for Apple to Do it!

Check out Rockbox. They don’t have it yet but I don’t see why they couldn’t build it for the iPod.

P.S. This post was not written to generate traffic :P

Tags:

ipod, apple, music, itunes, mp3, mp3 player

Wikipedia 3.0: El fin de Google (traducción)

In Uncategorized on July 12, 2006 at 4:08 pm

Wikipedia 3.0: El fin de Google (traducción)

by Evolving Trends

Spanish version (by Eric Rodriguez of Toxicafunk)

The Semantic Web (or Web 3.0) promises to “organize the world’s information” in a dramatically more logical way than Google could ever achieve with its current engine design. This is true from the standpoint of machine comprehension as opposed to human comprehension. The Semantic Web requires the use of a declarative ontological language, such as OWL, to produce domain-specific ontologies that machines can use to reason about information and thereby reach new conclusions, rather than simply searching for and matching keywords.

However, the Semantic Web, which is still at a development stage in which researchers are trying to define which model is best and which has the greatest usability, would require the participation of thousands of experts in different fields for an indefinite period of time in order to produce the domain-specific ontologies it needs to function.

Machines (or rather machine-based reasoning, also known as AI software or “info agents”) could then use those laboriously constructed, though not entirely manual, ontologies to build a view (or formal model) of how the individual terms within a given body of information relate to one another. Such relationships can be thought of as axioms (basic premises), which, together with the rules governing the inference process, both enable and constrain the interpretation (and well-formed use) of those terms by the info agents, so that they can reason new conclusions based on existing information, that is, think. In other words, software could be used to generate theorems (formal, provable propositions based on the axioms and the rules of inference), thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of Logic Theory, two or more info agents processing the same domain-specific ontology will be able to collaborate and deduce the answer to a query (a search or database lookup), without being driven by the same software.

Thus, as has been established, in the Semantic Web machine-based agents (or a collaborating group of agents) will be able to understand and use information by translating concepts and deducing new information rather than simply matching keywords.

Once machines can understand and use information, using a standard ontology language, the world will never be the same. It will be possible to have an info agent (or several) among your AI-enhanced virtual “workforce”, each having access to different domain-specific spaces of comprehension and all communicating with one another to form a collective consciousness.

You will be able to ask your info agent (or agents) to find you the nearest Italian restaurant, even if the restaurant closest to you advertises itself as a pizza place rather than an Italian restaurant. But this is only a very simple example of the deductive reasoning machines will be able to perform on top of existing information.

Far more impressive implications will be seen when you consider that every area of human knowledge will automatically be within the space of comprehension of your info agents. That is because each agent can communicate with other info agents specialized in different domains of knowledge to produce a collective consciousness (to use the Borg metaphor) that encompasses all of human knowledge. The collective “mind” of these Borg-like agents will form the Ultimate Answer Machine, easily displacing Google from that position, which it does not fully occupy.

The problem with the Semantic Web, apart from the fact that researchers are still debating which design and implementation of the ontology language model (and associated technologies) is the best and most usable, is that it would take thousands upon thousands of knowledgeable people many years to boil human knowledge down into domain-specific ontologies.

However, if at some point we were to take the Wikipedia community and give it the right tools and standards to work with (whether existing ones or ones to be developed in the future), so that reasonably skilled individuals could reduce human knowledge to domain-specific ontologies, then the time needed to do so would be shortened to just a few years, or possibly two.

The emergence of a Wikipedia 3.0 (in reference to Web 3.0, the name given to the Semantic Web) based on the Semantic Web model would herald the end of Google as the Ultimate Answer Machine. It would be replaced by “WikiMind”, which would not be a mere search engine like Google but a true Global Brain: a powerful pan-domain inference engine, with a vast set of ontologies (à la Wikipedia 3.0) covering all domains of human knowledge, capable of reasoning and deducing answers instead of simply throwing raw information at you via the outdated search-engine concept.

Notes
After writing the original post I discovered that the Wikipedia application, also known as MediaWiki (not to be confused with Wikipedia.org), has already been used to implement ontologies. The name they have chosen is Ontoworld. I think WikiMind or WikiBorg would have been a catchier name, but I like Ontoworld too, as in “and it descended onto the world,” (1) since it can be taken as a reference to the global mind that a Semantic Web-enabled Ontoworld would give rise to.

In just a few years the search-engine technology that provides Google with almost all of its revenue/capital would be obsolete… unless it had an agreement with Ontoworld allowing it to connect to Ontoworld’s database of ontologies, thereby adding inference-engine capability to Google search.

But the same is true for Ask.com, MSN and Yahoo.

I would love to see more competition in this space, rather than seeing Google, or any other company, establish itself as the leader over the others.

The question, in Churchillian terms, is whether the combination of Wikipedia and the Semantic Web means the beginning of the end for Google or the end of the beginning. Obviously, with billions of dollars of investors’ money at stake, I would say it is the latter. However, I would certainly like to see someone overtake them (which, in my opinion, is possible).

(1) Translator’s note: this refers to the wordplay on the prefix “Onto” in ontology, which sounds like the English “unto”. The original phrase is “and it descended onto the world.”

Clarification
Please note that Ontoworld, which currently implements the ontologies, is based on the “Wikipedia” application (also known as MediaWiki), which is not the same thing as Wikipedia.org.

Likewise, I hope that Wikipedia.org will use its volunteer workforce to reduce the sum of human knowledge that has been entered into its database to domain-specific ontologies for the Semantic Web (Web 3.0), and hence “Wikipedia 3.0”.

Response to Readers’ Comments
My argument is that Wikipedia already has the volunteer resources to produce the ontologies, for each of the domains of knowledge it currently covers, that the Semantic Web so badly needs, whereas Google does not have such resources, and so it would have to depend on Wikipedia.

The ontologies, along with all the information on the Web, will be accessible to Google and the rest, but it will be Wikipedia that is in charge of those ontologies, since Wikipedia already covers an enormous number of knowledge domains, and that is where I see the shift in power.

Neither Google nor the other companies have the human resources (the thousands of volunteers Wikipedia has) needed to create the ontologies for all the domains of knowledge that Wikipedia already covers. Wikipedia does have those resources, and it is positioned in such a way that it can do the job better and more effectively than anyone else. It is hard to see how Google could manage to create those ontologies (which are constantly growing in both number and size) given the amount of work required. Wikipedia, by contrast, can move much faster thanks to its massive and dedicated force of expert volunteers.

I believe the competitive advantage will go to whoever controls the creation of the ontologies for the largest number of knowledge domains (i.e. Wikipedia), not to whoever simply accesses them (i.e. Google).

There are many domains of knowledge that Wikipedia does not yet cover. Here Google would have an opportunity, but only if the people and organizations that produce the information also built their own ontologies, so that Google could access them through its future Semantic Web engine. I believe that this will happen in the future, but only little by little, and that Wikipedia can have the ontologies for all the knowledge domains it already covers ready much faster, with the enormous added advantage of being in charge of those ontologies (the basic layer for enabling AI).

It is still not clear, of course, whether the combination of Wikipedia and the Semantic Web heralds the end of Google or the end of the beginning. As I mentioned in the original article, I think it is the latter, and that the question in this post’s title is, in the present context, merely rhetorical. However, I could be wrong in my judgment, and Google may well give way to Wikipedia as the world’s ultimate answer machine.

After all, Wikipedia has “us”. Google does not. Wikipedia derives its power from “us”. Google derives its power from its technology and its inflated market valuation. Whom would you count on to change the world?

Response to Basic Questions from Readers
The reader divotdave asked a few questions that I find basic in nature (that is, important). I believe more people will be wondering about the same things, so I am including them here along with my answers.

Question:
How do you distinguish between good information and bad? How do you determine which parts of human knowledge to accept and which to reject?

Answer:
There is no need to distinguish between good and bad information (not to be confused with well-formed vs. badly formed) if a trusted source of information (with its associated trusted ontologies) is used. That is, if the information or knowledge being sought can be derived from Wikipedia 3.0, then the information is assumed to be trustworthy.

However, when it comes to connecting the dots when returning information or deducing answers from the vast sea of information that lies beyond Wikipedia, the question becomes very relevant: how would you distinguish good information from bad so as to produce good knowledge (i.e. comprehended information, or new information produced through deductive reasoning based on existing information)?

Question:
Who, or what as the case may be, determines which information is irrelevant to me as the end user?

Answer:
That is a good question, and one that must be answered by the researchers working on the AI engines for Web 3.0.

Certain assumptions will have to be made about what is being asked. Just as I had to make assumptions about what you were really asking when I read your question, so will the AI engines, based on a cognitive process very similar to our own, which is a subject for another post, but one that has been studied by many AI researchers.

Question:
Does this ultimately mean that an all-powerful standard will emerge, one that all of humanity will have to adhere to (for lack of alternative information)?

Answer:
There is no need for a standard, except with regard to the language in which the ontologies will be written (i.e. OWL, OWL-DL, OWL Full, etc.). Semantic Web researchers are trying to determine the best and most usable option, taking into account human and machine performance in constructing and, in the latter case only, interpreting those ontologies.

Two or more info agents working with the same domain-specific ontology but with different software (different AI engines) can collaborate with each other. The only standard needed is the ontology language and the associated production tools.

Addendum

On AI and Natural Language Processing

I believe that the first generation of AI to be used by Web 3.0 (aka the Semantic Web) will be based on relatively simple inference engines (employing both algorithmic and heuristic approaches) that will not attempt any kind of natural language processing. They will, however, retain the formal deductive reasoning capabilities described in this article.

On the Debate About the Nature and Definition of AI

AI will first be introduced into cyberspace through inference engines (using algorithms and heuristics) that collaborate in a P2P-like fashion and that work with standard ontologies. The parallel interaction among hundreds of millions of AI agents running inside P2P AI engines on users’ PCs will give rise to the complex behavior of the future global brain.

ViRAL Text

In Uncategorized on July 12, 2006 at 11:11 am

This post has been replaced by the following link which contains an up-to-date list of all the ‘viral’ articles:

https://evolvingtrends.wordpress.com/viral/

Tags:

Web 2.0, Where 2.0, social networking, Trends, Who 2.0, tagging, Startup, Semantic Web, Web standards, OWL, inference engine, AI, ontology, Web 3.0, Wikipedia, Wikipedia 3.0, wisdom of crowds, mass psychology, cult psychology, digg, censorship, P2P, P2P 2.0, Web 2.5, governance, Internet governance, pattern recognition, non-linear feedback loop, neural network, prediction markets, e-society, national security, economy, political science, cultural phenomenon

Get Your DBin

In Uncategorized on July 12, 2006 at 9:06 am

Upon very quick glance, DBin seems to be about people (or rather ‘domain experts’) building the semantic annotations (informal ontologies), inference rules and query structures. I thought the last three pieces would be specified by the inference engine vendors, but I believe that DBin lets any person who qualifies as a domain expert add value!

Related

  1. P2P 3.0: The People’s Google

Tags:

Semantic Web, Web standards, Trends, OWL, innovation, Startup, Google, ontology, Web 3.0, Wikipedia, Wikipedia 3.0, Ontoworld, OWL-DL, DBin, Semantic MediaWiki, P2P 3.0

Semantic MediaWiki

In Uncategorized on July 12, 2006 at 6:01 am

What is it?

Semantic MediaWiki is an ongoing open source project to develop a Semantic Wiki Engine.

In other words, it is one of the important early innovations leading up to the Wikipedia 3.0 (Web 3.0) vision.

  • The project and software is called "Semantic MediaWiki"
  • ontoworld.org is just one site using the technology
  • Wikipedia might become another site using the technology 

Update

The hosting of Semantic MediaWiki, i.e. the Web 3.0 version of Wikipedia’s platform, has been taken over by Wikia, a commercial venture founded by Wikipedia’s own founder, Jimmy Wales. This opens up a huge conflict of interest, namely that Wikipedia’s founder is running a commercial venture that takes creative improvements to Wikipedia’s platform, e.g. Semantic MediaWiki, and transfers those improvements to Wikia, Jimmy Wales’ own commercial for-profit venture.

Related

  1. Wikipedia 3.0: The End of Google?
  2. Web 3.0: Basic Concepts
  3. P2P 3.0: The People’s Google
  4. Semantic MediaWiki project website (as noted in the Update, Semantic MediaWiki hosting has been taken over by Wikipedia’s founder Jimmy Wales’ commercial venture Wikia…)

Tags:

Semantic Web, Web standards, Trends, OWL, innovation, Startup, Evolution, Google, ontology, Web 2.0, Web 3.0, Wikipedia, Wikipedia 3.0, Ontoworld, OWL-DL, Semantic MediaWiki, P2P 3.0

The People’s Google

In Uncategorized on July 11, 2006 at 10:16 am

Author: Marc Fawzi

Twitter: http://twitter.com/#!/marcfawzi

License: Attribution-NonCommercial-ShareAlike 3.0

/*

This is a follow-up to the Wikipedia 3.0 article.

See this article for a more disruptive ‘decentralized knowledgebase’ version of the model discussed here.

Also see this non-Web3.0 version: P2P to Destroy Google, Yahoo, eBay et al

Web 3.0 Developers:

Feb 5, ‘07: The following reference should provide some context regarding the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0) but there are better, simpler ways of doing it.

  1. Description Logic Programs: Combining Logic Programs with Description Logic

*/

In Web 3.0 (aka Semantic Web), P2P Inference Engines running on millions of users’ PCs and working with standardized domain-specific ontologies (that may be created by entities like Wikipedia and other organizations) using Semantic Web tools will produce an information infrastructure far more powerful than the current infrastructure that Google uses (or any Web 1.0/2.0 search engine for that matter.)

Having the standardized ontologies and the P2P Semantic Web Inference Engines that work with those ontologies will lead to a more intelligent, “Massively P2P” version of Google.

Therefore, the emergence in Web 3.0 of said P2P Inference Engines combined with standardized domain-specific ontologies will present a major threat to the central “search” engine model.

Basic Web 3.0 Concepts

Knowledge domains

A knowledge domain is something like Physics, Chemistry, Biology, Politics, the Web, Sociology, Psychology, History, etc. There can be many sub-domains under each domain each having their own sub-domains and so on.

Information vs Knowledge

To a machine, knowledge is comprehended information (aka new information that is produced via the application of deductive reasoning to existing information). To a machine, information is only data, until it is reasoned about.

Ontologies

For each domain of human knowledge, an ontology must be constructed, partly by hand and partly with the aid of dialog-driven ontology construction tools.

Ontologies are not knowledge, nor are they information. They are meta-information. In other words, ontologies are information about information. In the context of the Semantic Web, they encode, using an ontology language, the relationships between the various terms within the information. Those relationships, which may be thought of as the axioms (basic assumptions), together with the rules governing the inference process, both enable and constrain the interpretation (and well-formed use) of those terms by the Info Agents as they reason new conclusions based on existing information, i.e. think. In other words, theorems (formal deductive propositions that are provable based on the axioms and the rules of inference) may be generated by the software, thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of Logic Theory, two or more independent Info Agents processing the same domain-specific ontology will be able to collaborate and deduce an answer to a query, without being driven by the same software.
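
As a small worked example of axioms plus inference, here is a hedged Jena sketch (the namespace and class names are made up) in which a single axiom, “a pizza place is a kind of Italian restaurant,” lets the machine answer a question about Italian restaurants with a business that only ever described itself as a pizza place:

import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

public class RestaurantAxiomSketch {
    public static void main(String[] args) {
        // OWL model backed by one of Jena's simple built-in rule reasoners.
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
        String ns = "http://example.org/food#"; // hypothetical namespace

        OntClass italian = model.createClass(ns + "ItalianRestaurant");
        OntClass pizzaPlace = model.createClass(ns + "PizzaPlace");
        pizzaPlace.addSuperClass(italian); // the axiom

        // The information as published: the business only says it is a pizza place.
        model.createIndividual(ns + "Marios", pizzaPlace);

        // The question is about Italian restaurants; the reasoner closes the gap.
        ExtendedIterator it = model.listIndividuals(italian);
        while (it.hasNext()) {
            System.out.println(it.next()); // prints .../Marios
        }
    }
}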

Inference Engines

In the context of Web 3.0, inference engines will combine the latest innovations from the artificial intelligence (AI) field with domain-specific ontologies (created as formal or informal ontologies by, say, Wikipedia, as well as others), domain inference rules, and query structures to enable deductive reasoning at the machine level.
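
A minimal sketch of that combination, using Jena’s generic rule engine over a toy data set (the namespace, the property, and the single transitivity rule are invented; real domain inference rules would be far richer):

import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;
import com.hp.hpl.jena.reasoner.rulesys.Rule;

public class DomainRuleSketch {
    public static void main(String[] args) {
        String ns = "http://example.org/domain#"; // hypothetical namespace
        Model data = ModelFactory.createDefaultModel();
        Property locatedIn = data.createProperty(ns, "locatedIn");
        Resource shop = data.createResource(ns + "RecordShop");
        Resource seattle = data.createResource(ns + "Seattle");
        Resource washington = data.createResource(ns + "Washington");
        data.add(shop, locatedIn, seattle);
        data.add(seattle, locatedIn, washington);

        // One domain inference rule: locatedIn is transitive.
        String rules = "[transitive: (?a <" + ns + "locatedIn> ?b) (?b <" + ns + "locatedIn> ?c) "
                + "-> (?a <" + ns + "locatedIn> ?c)]";
        GenericRuleReasoner reasoner = new GenericRuleReasoner(Rule.parseRules(rules));
        InfModel inf = ModelFactory.createInfModel(reasoner, data);

        // Deduced rather than stated: the shop is in Washington.
        System.out.println(inf.contains(shop, locatedIn, washington)); // true
    }
}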

Info Agents

Info Agents are instances of an Inference Engine, each working with a domain-specific ontology. Two or more agents working with a shared ontology may collaborate to deduce answers to questions. Such collaborating agents may be based on differently designed Inference Engines and they would still be able to collaborate.

Proofs and Answers

The interesting thing about Info Agents that I did not clarify in the original post is that they will be capable not only of deducing answers from existing information (i.e. generating new information [and gaining knowledge in the process, for those agents with a learning function]) but also of formally testing propositions (represented in some query logic) that are made directly, or implied, by the user.
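
One way to picture testing a proposition is as a boolean ASK query that the agent runs against its model. Here is a hedged sketch using Jena’s ARQ engine; the data URL and vocabulary are invented, and a real agent would query an inference-backed model rather than a plain one.

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class PropositionCheckSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        // Placeholder: in practice this would be the agent's inferred knowledge base.
        model.read("http://example.org/data/knowledge-base.rdf");

        // Proposition to test: "Mario's is an Italian restaurant."
        String ask =
            "PREFIX ex: <http://example.org/food#> " +
            "ASK { ex:Marios a ex:ItalianRestaurant }";

        QueryExecution qexec = QueryExecutionFactory.create(ask, model);
        try {
            boolean holds = qexec.execAsk();
            System.out.println(holds ? "proposition holds" : "cannot be shown from the model");
        } finally {
            qexec.close();
        }
    }
}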

P2P 3.0 vs Google

If you think of how many processes currently run on all the computers and devices connected to the Internet then that should give you an idea of how many Info Agents can be running at once (as of today), all reasoning collaboratively across the different domains of human knowledge, processing and reasoning about heaps of information, deducing answers and deciding truthfulness or falsehood of user-stated or system-generated propositions.

Web 3.0 will bring with it a shift from centralized search engines to P2P Semantic Web Inference Engines, which will collectively have vastly more deductive power, in both quality and quantity, than Google can ever have (included in this assumption is any future AI-enabled version of Google, as it will not be able to keep up with the power of the P2P AI matrix that will be enabled by millions of users running free P2P Semantic Web Inference Engine software on their home PCs.)

Thus, P2P Semantic Web Inference Engines will pose a huge and escalating threat to Google and other search engines, and can be expected to do to them what P2P file sharing and BitTorrent did to FTP (central-server file transfer) and centralized file hosting in general (e.g. Amazon’s S3 use of BitTorrent.)

In other words, the coming of P2P Semantic Web Inference Engines, as an integral part of the still-emerging Web 3.0, will threaten to wipe out Google and other existing search engines. It’s hard to imagine how any one company could compete with 2 billion Web users (and counting), all of whom are potential users of the disruptive P2P model described here.

The Future

Currently, Semantic Web (aka Web 3.0) researchers are working out the technology and human resource issues, and folks like Tim Berners-Lee, the father of the Web, are battling critics and enlightening minds about the coming Semantic Web revolution.

In fact, the Semantic Web (aka Web 3.0) has already arrived, and Inference Engines are working with prototypical ontologies, but this effort is a massive one, which is why I was suggesting that its most likely enabler will be a social, collaborative movement such as Wikipedia, which has the human resources (in the form of the thousands of knowledgeable volunteers) to help create the ontologies (most likely as informal ontologies based on semantic annotations) that, when combined with inference rules for each domain of knowledge and the query structures for the particular schema, enable deductive reasoning at the machine level.

Addendum

On AI and Natural Language Processing

I believe that the first generation of AI that will be used by Web 3.0 (aka Semantic Web) will be based on relatively simple inference engines that will NOT attempt to perform natural language processing, where current approaches still face too many serious challenges. However, they will still have the formal deductive reasoning capabilities described earlier in this article, and users would interact with these systems through some query language.

Related

  1. Wikipedia 3.0: The End of Google?
  2. Intelligence (Not Content) is King in Web 3.0
  3. Get Your DBin
  4. All About Web 3.0

Tags:

Semantic Web, Web standards, Trends, OWL, Google, inference engine, AI, ontology, Web 2.0, Web 3.0, Wikipedia, Wikipedia 3.0, collective consciousness, Ontoworld, AI Engine, OWL-DL, Semantic MediaWiki, P2P 3.0