Witnessed by President Xi Jinping and Brazilian President Dilma Rousseff, Baidu launched its Portuguese-language search and entered the Brazilian market. It is another non-Chinese search product from Baidu, following its Thai and Arabic search. A few days later, Baidu released better-than-expected Q2 2014 earnings. The double good news sent its share price soaring, and its market capitalisation reached nearly 80 billion.
An engineer close to Baidu's Portuguese search told me that the team works in a "coyote" style. From data collection to effect validation, Baidu now needs only three to four weeks to launch a search engine for a new language. After Arabic, Thai, and Portuguese, Baidu can roll out more foreign-language or minority-language search products quickly and with very few resources.
Launching a new language search relies on the technology Baidu has accumulated over the years. Combined with the "coyote" style, that technology will inevitably win Baidu more and more of the "new world", and it may also help Baidu's market value advance faster toward the "billions club".
The RANK technology behind the new "coyote" style of play
A search engine is a very complex system, but its process can be summed up in one sentence: crawl web data, extract structured data, and build the index; then understand what the user is looking for, query the index to find a set of candidate results, sort them, and output them.
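The pipeline described above can be sketched in a few lines of Python. This is only a toy illustration, not Baidu's implementation: the sample documents, the whitespace tokenizer, and the function names `build_index` and `search` are all assumptions made for the example, and the sort simply counts matching terms.

```python
from collections import defaultdict

def tokenize(text):
    """Naive tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()

def build_index(docs):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(query, docs, index):
    """Find candidate documents for the query, then sort them by a simple score."""
    terms = tokenize(query)
    candidates = set()
    for term in terms:
        candidates |= index.get(term, set())

    def score(doc_id):
        # Toy relevance: how many times the query terms occur in the document.
        words = tokenize(docs[doc_id])
        return sum(words.count(term) for term in terms)

    # Highest score first; ties are broken by document id.
    return sorted(candidates, key=lambda doc_id: (-score(doc_id), doc_id))

# Hypothetical sample data, for illustration only.
DOCS = {
    1: "cheap flights to sao paulo",
    2: "sao paulo football news",
    3: "flights and hotels flights deals",
}
INDEX = build_index(DOCS)
print(search("flights sao paulo", DOCS, INDEX))
```

A real engine replaces each of these steps with far heavier machinery, but the order of the steps is the same: index first, then recall candidates, then sort.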
Data processing and the understanding of user needs rely on NLP (natural language processing). After more than ten years of evolution, Baidu's NLP technology has reached an internationally leading level. Baidu's WD team, meanwhile, acquires structured data and organizes resources from across the web. There is one more key link, the one closest to the user and also very important: RANK.
Sorting is one of the most important branches of computer algorithms, and ranking is the most important part of a search engine. Early search engines competed on the size of the result set (recall) and on fast response times. Now that we have entered the era of information overload, and especially with the rise of mobile search, the most important consideration is the accuracy of the results. "Accurate" here means the relevance between the results and what the user needs. NLP understands the user's needs and WD prepares the data, but what decides the relevance of the results is the RANK strategy.
The whole search technology system is like a football team that needs defenders, midfielders, and strikers. If the NLP and WD departments are the defenders and midfielders, then RANK is the striker. Winning a match of course requires a well-organized defence and a nimble midfield, but to finally win the game, what is needed most is the striker's charge.
As a search engine supporting hundreds of millions of users, Baidu has a very complex and intelligent RANK strategy. But RANK is not hard to understand. It scores the relevance of each result using a large number of parameters with different weights, and the results with the highest scores are placed first. The parameters include text similarity, semantic relevance, user characteristics, search history, and even the user's location. All in all, what RANK does is put the results closest to the user's search needs at the front.
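The idea of weighted relevance scoring can be illustrated with a minimal sketch. The feature names, the weights, and the sample results below are hypothetical, chosen only to mirror the kinds of parameters mentioned above; a production ranker uses far more signals and does not hand-pick its weights.

```python
# Hypothetical weights (assumption): each feature is a value between 0 and 1.
WEIGHTS = {
    "text_similarity": 0.4,
    "semantic_relevance": 0.3,
    "history_match": 0.2,
    "location_match": 0.1,
}

def relevance_score(features, weights=WEIGHTS):
    """Weighted sum of feature values; missing features count as 0."""
    return sum(weights[name] * features.get(name, 0.0) for name in weights)

def rank(results, weights=WEIGHTS):
    """Return results sorted so that the highest relevance score comes first."""
    return sorted(
        results,
        key=lambda result: relevance_score(result["features"], weights),
        reverse=True,
    )

# Made-up candidate results for one query.
RESULTS = [
    {"url": "a.example", "features": {"text_similarity": 0.9, "semantic_relevance": 0.2}},
    {"url": "b.example", "features": {"text_similarity": 0.6, "semantic_relevance": 0.8,
                                      "history_match": 1.0, "location_match": 1.0}},
    {"url": "c.example", "features": {"text_similarity": 0.3, "semantic_relevance": 0.4}},
]

for result in rank(RESULTS):
    print(result["url"], round(relevance_score(result["features"]), 2))
```

Here `b.example` ranks first even though its text similarity is lower than that of `a.example`, because the user-history and location signals push its total score higher.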
fast and exact RANK into an
To better support internationalization, Baidu's RANK division deeply refactored its original architecture so that a new language can be added in a plug-and-play fashion. This is possible because of Baidu's new RANK technology. The RANK algorithm itself is intelligent: the international RANK division uses a machine learning technique known internally as LTR, and it has deeply reworked the ranking model, including the training samples, the sample data, the algorithm, and the tuning. The evolution of the ranking model is itself geared to the needs of different languages.
In Chinese search, Baidu RANK can personalize the ordering of results according to features such as the user, the location, and the time. Scene-oriented RANK technology lets each user find what they want at different moments. For example, a user who searched for "Malaysia Airlines" in ordinary times, say a year ago, most likely wanted to buy a Malaysia Airlines ticket and was looking for flight and discount information. Today a user searching for "Malaysia Airlines" is less likely to be buying a ticket and more likely to be following the news. Such examples cannot be listed exhaustively. A massive volume of search needs corresponds to a massive number of scenes, so Baidu's engineers cannot optimize the algorithm by hand for every scene. The only way is to make the RANK model self-learning and intelligent.
In fact, Baidu's RANK system is an intelligent system based on deep learning. When entering a new language, engineers prepare the relevant training corpus and label it for relevance, and the RANK relevance model then trains automatically with very good results. Because the RANK architecture was designed with internationalization fully in mind, some features of Baidu's Chinese search have already been integrated into the latest version of the Portuguese search: direct-answer display formats, sorting and filtering controls that narrow the search on the results page, and video playback directly within the results page. Results are not necessarily the same for different users, at different times, or in different scenarios. They vary as needed.
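The workflow of labeling relevance and letting a model learn the ranking can be sketched as follows. This is a deliberately simple stand-in: it trains a pointwise logistic regression with plain gradient descent, not the deep learning model or the LTR system the article refers to, and the two features and all sample values are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    """Predicted probability that a result is relevant."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train(samples, epochs=500, lr=0.5):
    """Fit a logistic regression on (features, label) pairs by batch gradient descent."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in samples:
            error = label - predict(features, weights, bias)
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        # Step along the averaged gradient of the log-likelihood.
        weights = [w + lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / len(samples)
    return weights, bias

def rank_candidates(candidates, weights, bias):
    """Order candidate ids by predicted relevance, highest first."""
    return sorted(candidates, key=lambda c: predict(candidates[c], weights, bias), reverse=True)

# Hypothetical labeled data: [text_similarity, click_rate] -> 1 (relevant) or 0 (not relevant).
LABELED = [
    ([0.9, 0.7], 1), ([0.8, 0.9], 1), ([0.7, 0.6], 1),
    ([0.2, 0.1], 0), ([0.1, 0.3], 0), ([0.3, 0.2], 0),
]

WEIGHTS, BIAS = train(LABELED)
print(rank_candidates({"doc_x": [0.2, 0.2], "doc_y": [0.8, 0.7]}, WEIGHTS, BIAS))
```

The point of the sketch is that nothing in the training loop is specific to one language. Given labeled examples for a new language, the same code learns a new set of weights, which is the property the plug-and-play architecture depends on.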
Making RANK intelligent across different languages is feasible in theory. Baidu's chief scientist Andrew Ng once led a team that used deep learning to let machines recognize cats automatically. By letting machine learning grasp the nature of a new language in depth, the Baidu RANK team has made search relevance particularly good without Portuguese or Thai language experts, and RANK has become a successful application of deep learning to ranking.
Baidu's coyote tactics in its overseas fight against Google
Since Robin Li called for a "wolf spirit" at the end of 2012, Baidu's moves throughout 2013 have reflected this wolf culture. In overseas markets, Baidu is about to play the coyote and grab Google's market share. Coyote tactics come from Huawei. Huawei's internationalization followed the principle of "encircling the cities from the countryside", taking on easier markets first and harder ones later. It entered Hong Kong first, and then Russia. In 1997 Huawei entered the African market, followed by Latin America and Southeast Asia, and only at the end the heartlands of Europe and the United States, much like the tactics of China's liberation era. Now Baidu has started with Arabic, Thai, and Portuguese, and in the future it will continue from the "countryside" to encircle Google's heartland markets such as Europe and the United States.
Baidu and Google already faced off in China five years ago. The two search engines had different philosophies. Baidu put more emphasis on structured data with its Aladdin plan, and at the same time strengthened UGC channels such as Baidu Zhidao (Q&A), Baike (encyclopedia), and Tieba (forums). Google had too much faith in technology and relied on it too heavily, which led it to look down on data and operations. The result was that Baidu's search results pages were richer and more diverse, with direct-answer, Tieba, and Baike results. Today the knowledge graph on Baidu's results pages mines and displays knowledge related to the user's needs and the results, and judging from various cases, Baidu's knowledge graph performs better than Google's. This suggests that Baidu's RANK, WD, and NLP teams joined forces to block Google successfully. Even if Google had not left China, or even if Google were to return to China, it still could not beat Baidu.
Now that Baidu and Google meet again in overseas markets, Baidu is taking Google on in a similar way, one market at a time. Take the knowledge graph as an example. In Brazil, Baidu on the one hand works with a large number of third parties to obtain structured data, and on the other hand has done extensive entity mining, cleaning, and merging in vertical categories. In just half a year it launched dozens of vertical categories and accumulated tens of millions of entity records, leaving Google far behind in actual coverage. Google faces the global market top-down and aims for comprehensive coverage. Spread that wide, it can only fight a strategic war, whereas Baidu advances step by step and fights a coyote's guerrilla war. Where Google tends toward general-purpose technical solutions that are low-cost and efficient, Baidu localizes and cultivates each local market intensively. Data operations in particular have always been Google's weakness, and they are what Baidu is good at.
Even though Google has a first-mover advantage, Baidu can still break into overseas markets with its vertical and local strategy. An eventual confrontation with Google in the English-language market is highly likely, and the battle between the two search giants will then be even more spectacular.
Author: Weibo @ Internet super, WeChat SuperSofter