Capture requests from a website and retrieve the responses?

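I am trying to monitor a website (www.bidcactus.com). On the site I open Firebug, go to the Net tab, and click the XHR tab.

I want to take the responses to those requests and save them to a MySQL database (I have a local database running on my computer via XAMPP).

I have been told to do this in various ways, mainly with jQuery or JavaScript, but I have no experience with either, so I was wondering if anyone could help me out here.

Someone suggested this link to me: "Using Greasemonkey and jQuery to intercept JSON/AJAX data from a page, and process it".

It also uses Greasemonkey, which I don't know much about…

Thanks in advance for any help.

Example / more details:
While monitoring the requests being sent (via Firebug), I see the following: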

http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585

The response of this link is the following:

    {"s":"uk5c","a":[{"w":"MATADORA","t":944,"p":5,"a":413173,"x":10},
    {"w":"1000BidsAintEnough","t":6,"p":863,"a":413198,"x":0},
    {"w":"YourBidzWillBeWastedHere","t":4725,"p":21,"a":413200,"x":8},
    {"w":"iwillpay2much","t":344,"p":9,"a":413201,"x":9},
    {"w":"apcyclops84","t":884,"p":3,"a":413213,"x":14},
    {"w":"goin_postal","t":165,"p":5,"a":413215,"x":12},
    {"w":"487951","t":825,"p":10,"a":413218,"x":6},
    {"w":"mishmash","t":3225,"p":3,"a":413222,"x":7},
    {"w":"CrazyKatLady2","t":6464,"p":1,"a":413224,"x":2},
    {"w":"BOSS1","t":224,"p":102,"a":413230,"x":4},
    {"w":"serbian48","t":62,"p":2,"a":413232,"x":11},
    {"w":"Tuffenough","t":1785,"p":1,"a":413234,"x":1},
    {"w":"apcyclops84","t":1970,"p":1,"a":413240,"x":13},
    {"w":"Tuffenough","t":3524,"p":1,"a":413244,"x":5},
    {"w":"Cdm17517","t":1424,"p":1,"a":413252,"x":3}],"tau":"0"}

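I understand this information and I think I can format it myself, but the site keeps issuing new requests with a seemingly random rnd value.
Example: http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=XXXXXXXXXXXX
And I am not sure how it generates them.

So I need to capture the responses of all the ItemUpdates requests and send the information to a MySQL database.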

OK, here is working code, somewhat tuned for that site (front page only, no account).

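Instructions for use:

  1. Install the GM script. Note that it currently works only in Firefox.

  2. Watch it run in Firebug's console, and tune the filter section (clearly marked) to target the data you are interested in. (Maybe the whole a array?)

    Note that it can take a few seconds after "Script Start" is printed before the ajax intercepts begin.

  3. Set up your web app and server to receive the data. The script POSTs JSON, so PHP, for example, would read the data like so:

     $jsonData = json_decode (file_get_contents ('php://input'));

    (Older PHP versions exposed the same request body as $HTTP_RAW_POST_DATA, which was removed in PHP 7.0.)

  4. Point the script at your server.

  5. Voilà. She is done.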


    /******************************************************************************
    *******************************************************************************
    **  This script intercepts ajaxed data from the target web pages.
    **  There are 4 main phases:
    **      1) Intercept XMLHttpRequest's made by the target page.
    **      2) Filter the data to the items of interest.
    **      3) Transfer the data from the page-scope to the GM scope.
    **          NOTE:   This makes it technically possible for the target page's
    **                  webmaster to hack into GM's slightly elevated scope and
    **                  exploit any XSS or zero-day vulnerabilities, etc.  The risk
    **                  is probably zero as long as you don't start any feuds.
    **      4) Use GM_xmlhttpRequest () to send the data to our server.
    *******************************************************************************
    *******************************************************************************
    */
    // ==UserScript==
    // @name            _Record ajax, JSON data.
    // @namespace       stackoverflow.com/users/331508/
    // @description     Intercepts Ajax data, filters it and then sends it to our server.
    // @include         http://www.bidcactus.com/*
    // ==/UserScript==

    var DEBUG   = true;
    if (DEBUG)  console.log ('***** Script Start *****');

    /******************************************************************************
    *******************************************************************************
    **  PHASE 1 starts here, this is the XMLHttpRequest intercept code.
    **  Note that it will not work in GM's scope.  We must inject the code to the
    **  page scope.
    **  The <><![CDATA[ ... ]]></> wrapper is E4X, a Firefox-only syntax that has
    **  since been removed from Firefox.
    *******************************************************************************
    *******************************************************************************
    */
    var funkyFunc = ( (<><![CDATA[

        /*--- ASSUMPTION: the lines from here down to the `length > 1` check were
            lost from this post.  They are reconstructed from the surviving code
            and comments, so treat them as a best guess.
        */
        DEBUG           = true;     //-- Page-scope copy of the DEBUG flag.
        payloadArray    = [];       //-- PHASE 3a: FIFO queue read by the GM scope.

        (function (open) {
            XMLHttpRequest.prototype.open = function (method, url, async, user, pass)
            {
                this.addEventListener ("readystatechange", function (evt)
                {
                    if (this.readyState == 4  &&  this.status == 200)   //-- Done, & status "OK".
                    {
                        var jsonObj = null;
                        try {
                            jsonObj = JSON.parse (this.responseText);
                        }
                        catch (err) {
                            //--- Not JSON; ignore this response.
                        }

                        //--- PHASE 2: This is the filter section.  Tune it here.
                        if (jsonObj  &&  jsonObj.a  &&  jsonObj.a.length > 1) {
                            /*--- For demonstration purposes, we will only get the 2nd
                                row in the `a` array. (Probably stands for "auction".)
                            */
                            payloadArray.push (jsonObj.a[1]);
                            if (DEBUG)  console.log (jsonObj.a[1]);
                        }
                        //--- Done at this stage!  Rest is up to the GM scope.
                    }
                }, false);

                /*--- Forward the caller's exact arguments, so that an omitted
                    async flag is not passed along as an explicit undefined.
                */
                open.apply (this, arguments);
            };
        } ) (XMLHttpRequest.prototype.open);
    ]]></>).toString () );

    function addJS_Node (text, s_URL)
    {
        var scriptNode                      = document.createElement ('script');
        scriptNode.type                     = "text/javascript";
        if (text)  scriptNode.textContent   = text;
        if (s_URL) scriptNode.src           = s_URL;

        //--- Fall back to the body or the root element if there is no <head>.
        var targ    = document.getElementsByTagName ('head')[0]
                    || document.body
                    || document.documentElement;
        targ.appendChild (scriptNode);
    }

    addJS_Node (funkyFunc);

    /******************************************************************************
    *******************************************************************************
    **  PHASE 3b:
    **  Set up a timer to check for data from our ajax intercept.
    **  Probably best to make it slightly faster than the target's
    **  ajax frequency (about 1 second?).
    *******************************************************************************
    *******************************************************************************
    */
    var timerHandle = setInterval (function() { SendAnyResultsToServer (); }, 888);

    function SendAnyResultsToServer ()
    {
        if (unsafeWindow.payloadArray) {
            var payload     = unsafeWindow.payloadArray;
            while (payload.length) {
                var dataRow = JSON.stringify (payload[0]);
                payload.shift ();   //--- Remove the oldest row from the front of the queue.
                if (DEBUG)  console.log ('GM script, pre Ajax: ', dataRow);

                /******************************************************************************
                *******************************************************************************
                **  PHASE 4: Send the data, one row at a time, to our server.
                **  The server would grab the data with:
                **      $jsonData = json_decode (file_get_contents ('php://input'));
                *******************************************************************************
                *******************************************************************************
                */
                GM_xmlhttpRequest ( {
                    method:     "POST",
                    url:        "http://localhost/db_test/ShowJSON_PostedData.php",
                    data:       dataRow,
                    headers:    {"Content-Type": "application/json"},
                    onload:     function (response) {
                                    if (DEBUG)  console.log (response.responseText);
                                }
                } );
            }
        }
    }

    //--- EOF
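Phase 1 of the script works by replacing a method with a wrapper that does some extra work and then calls through to the original. The same pattern can be tried outside a browser on a stand-in object. This is only a sketch: FakeXHR, wrapOpen, and seenUrls are hypothetical names invented for the illustration, not part of the script or of any library.

```javascript
// FakeXHR is a hypothetical stand-in for XMLHttpRequest, so this runs outside a browser.
function FakeXHR() { this.opened = null; }
FakeXHR.prototype.open = function (method, url) {
    this.opened = method + ' ' + url;
};

var seenUrls = [];   // every URL passed to open() is recorded here

// Replace proto.open with a wrapper that reports the URL, then calls the original.
function wrapOpen(proto, onOpen) {
    var originalOpen = proto.open;
    proto.open = function (method, url) {
        onOpen(url);
        // Forward the caller's exact arguments to the original method.
        return originalOpen.apply(this, arguments);
    };
}

wrapOpen(FakeXHR.prototype, function (url) { seenUrls.push(url); });

var xhr = new FakeXHR();
xhr.open('GET', 'http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585');
console.log(seenUrls.length, xhr.opened);
```

Because the wrapper sits on the prototype, it sees every request the page makes, whatever rnd value the URL carries.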


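Miscellaneous notes:

  1. I tested it on that site's main page, without logging in (I am not about to create an account there).

  2. I tested with AdBlock, FlashBlock, NoScript, and RequestPolicy all in full effect. JS was turned on for bidcactus.com (it has to be), but for no other domains. Turning all of those other scripts back on should not cause side effects, but if it does, I am not going to debug it.

  3. Code like this has to be tuned for the site and for how you browse the site [1]. It is up to you to do that. Hopefully the code is self-documenting enough.

  4. Enjoy!


[1] Mainly: the @include and @exclude directives, the JSON data selection and filtering, and whether iFrames need to be blocked. It is also recommended that the 2 DEBUG variables (one for the GM scope and one for the page scope) be set to false when the tuning is done.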
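The JSON selection and filtering mentioned in the footnote might look something like the following. It is a sketch under assumptions: selectRows and the wanted callback are hypothetical names, and since the site does not document what the short keys (w, t, p, a, x) mean, the code treats them as opaque fields.

```javascript
// A shortened copy of the sample response shown in the question.
var responseText = '{"s":"uk5c","a":[' +
    '{"w":"MATADORA","t":944,"p":5,"a":413173,"x":10},' +
    '{"w":"1000BidsAintEnough","t":6,"p":863,"a":413198,"x":0},' +
    '{"w":"BOSS1","t":224,"p":102,"a":413230,"x":4}],"tau":"0"}';

// Hypothetical helper: parse a response body and return the rows of its `a`
// array that pass `wanted`. Returns [] for non-JSON input or an unexpected shape.
function selectRows(text, wanted) {
    var jsonObj;
    try {
        jsonObj = JSON.parse(text);
    } catch (err) {
        return [];
    }
    if (!jsonObj || !Array.isArray(jsonObj.a)) return [];
    return jsonObj.a.filter(wanted);
}

// Keep the whole array ...
var allRows = selectRows(responseText, function () { return true; });
// ... or only the row with one particular `a` value (413198 is from the sample).
var oneRow = selectRows(responseText, function (row) { return row.a === 413198; });

console.log(allRows.length, JSON.stringify(oneRow));
```

Filtering in the page scope, before anything is queued, keeps the amount of data crossing into the GM scope small.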

Because of the same-origin policy, this cannot be done with ordinary JavaScript/jQuery ajax requests.
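For illustration, the same-origin rule compares scheme, host, and port. The sketch below uses the standard URL class to show that the auction site and a local server are different origins. sameOrigin is a hypothetical helper, and the real browser policy has more cases than this one comparison covers (CORS headers, for example). GM_xmlhttpRequest, which is available to Greasemonkey scripts, is not bound by this restriction.

```javascript
// Hypothetical helper: true when both URLs share scheme, host, and port.
function sameOrigin(a, b) {
    return new URL(a).origin === new URL(b).origin;
}

var page   = 'http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585';
var server = 'http://localhost/db_test/ShowJSON_PostedData.php';

console.log(sameOrigin(page, 'http://www.bidcactus.com/'));   // true
console.log(sameOrigin(page, server));                        // false
```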

I have no experience with Greasemonkey, though…