A Quick Test of Chaitin SafeLine WAF's “Dynamic Protection”

Anye
Published on 2024-06-03 / 263 views

Preface

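SafeLine WAF Community Edition ships updates really fast: a minor release almost every week and a major release every two months. The engineers are relentless, and it is hard to keep up with testing.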

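Without further ado: a couple of days ago I came across this article and got curious about SafeLine's “dynamic protection” feature, so I decided to try it out.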

Installation and Deployment

This post focuses on evaluation, so I will not walk through the deployment. Here is my test environment:

VM1: OpenResty deployed via 1Panel, hosting the project Anyeの导航 (IP 192.168.0.220)

VM2: SafeLine WAF Community Edition, with the site added and “dynamic protection” enabled (IP 192.168.0.225)

Testing

Grabbing the Page

This is how I usually replicate a theme: I pull the page's HTML, CSS, and JS files straight out of the browser developer tools and rebuild the theme from them.

A page with SafeLine dynamic protection enabled goes through a decryption step when it loads, which is really just JS being executed.

HTML

This step greatly lengthens the page load time; it takes roughly 3 s.

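Once the page has opened, inspecting the elements shows the same structure as the original, which means the encryption did not distort the page.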

The page is clearly encrypted, but the encryption also makes the index page much larger 🤣. Let's see whether this gets optimized later.

JS

I also tried it on a JS file: the encrypted JS returned is different on every request.

It has obviously been obfuscated, but a text comparison against the original gives the game away.
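The text comparison can be sketched with Python's standard `difflib`. This is only an illustration: `ORIGINAL` and `OBFUSCATED` are hypothetical miniature samples, not the real files, and `surviving_ratio` is a helper name made up for this sketch. It reports how many of the original lines appear unchanged, and in the same order, in the obfuscated output.

```python
import difflib

# Hypothetical miniature samples standing in for the two real files.
ORIGINAL = """\
var defaults = {
    src: "data-src",
    selector: ".lazyload"
};
return LazyLoad;
"""

OBFUSCATED = """\
var YIhUo91Nlh = 99.61;
while (YIhUo91Nlh < 6) {
    YIhUo91Nlh++
}
var defaults = {
    src: "data-src",
    selector: ".lazyload"
};
"eCDkWHqKcu";
return LazyLoad;
"""


def surviving_ratio(original: str, obfuscated: str) -> float:
    """Fraction of non-empty original lines found unchanged, in order, in the obfuscated text."""
    a = [line.strip() for line in original.splitlines() if line.strip()]
    b = [line.strip() for line in obfuscated.splitlines() if line.strip()]
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(a) if a else 0.0


if __name__ == "__main__":
    print(f"{surviving_ratio(ORIGINAL, OBFUSCATED):.0%} of the original lines survive")
```

On real files the ratio would likely be lower unless the text is normalized first, for example because `let`/`const` become `var` and trailing semicolons are dropped.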

Here is the complete JS code:

// Original JS file
/*!
 * Lazy Load - JavaScript plugin for lazy loading images
 *
 * Copyright (c) 2007-2017 Mika Tuupola
 *
 * Licensed under the MIT license:
 *   http://www.opensource.org/licenses/mit-license.php
 *
 * Project home:
 *   https://appelsiini.net/projects/lazyload
 *
 * Version: 2.0.0-beta.2
 *
 */
(function(root, factory) {
    if (typeof exports === "object") {
        module.exports = factory(root);
    } else if (typeof define === "function" && define.amd) {
        define([], factory(root));
    } else {
        root.LazyLoad = factory(root);
    }
}
)(typeof global !== "undefined" ? global : this.window || this.global, function(root) {
    "use strict";
    const defaults = {
        src: "data-src",
        srcset: "data-srcset",
        selector: ".lazyload"
    };
    /**
     * Merge two or more objects. Returns a new object.
     * @private
     * @param {Boolean}  deep     If true, do a deep (or recursive) merge [optional]
     * @param {Object}   objects  The objects to merge together
     * @returns {Object}          Merged values of defaults and options
     */
    const extend = function() {
        let extended = {};
        let deep = false;
        let i = 0;
        let length = arguments.length;
        /* Check if a deep merge */
        if (Object.prototype.toString.call(arguments[0]) === "[object Boolean]") {
            deep = arguments[0];
            i++;
        }
        /* Merge the object into the extended object */
        let merge = function(obj) {
            for (let prop in obj) {
                if (Object.prototype.hasOwnProperty.call(obj, prop)) {
                    /* If deep merge and property is an object, merge properties */
                    if (deep && Object.prototype.toString.call(obj[prop]) === "[object Object]") {
                        extended[prop] = extend(true, extended[prop], obj[prop]);
                    } else {
                        extended[prop] = obj[prop];
                    }
                }
            }
        };
        /* Loop through each object and conduct a merge */
        for (; i < length; i++) {
            let obj = arguments[i];
            merge(obj);
        }
        return extended;
    };
    function LazyLoad(images, options) {
        this.settings = extend(defaults, options || {});
        this.images = images || document.querySelectorAll(this.settings.selector);
        this.observer = null;
        this.init();
    }
    LazyLoad.prototype = {
        init: function() {
            /* Without observers load everything and bail out early. */
            if (!root.IntersectionObserver) {
                this.loadImages();
                return;
            }
            let self = this;
            let observerConfig = {
                root: null,
                rootMargin: "0px",
                threshold: [0]
            };
            this.observer = new IntersectionObserver(function(entries) {
                entries.forEach(function(entry) {
                    if (entry.intersectionRatio > 0) {
                        self.observer.unobserve(entry.target);
                        self.loadImage(entry.target);
                    }
                });
            }
            ,observerConfig);
            this.images.forEach(function(image) {
                self.observer.observe(image);
            });
        },
        loadAndDestroy: function() {
            if (!this.settings) {
                return;
            }
            this.loadImages();
            this.destroy();
        },
        loadImage: function(image) {
            image.onerror = function() {
                image.onerror = null;
                image.src = image.srcset = image.dataset.original;
            }
            ;
            let src = image.getAttribute(this.settings.src);
            let srcset = image.getAttribute(this.settings.srcset);
            if ("img" === image.tagName.toLowerCase()) {
                if (src) {
                    image.dataset.original = image.src;
                    image.src = src;
                }
                if (srcset) {
                    image.srcset = srcset;
                }
            } else {
                image.style.backgroundImage = "url(" + src + ")";
            }
        },
        loadImages: function() {
            if (!this.settings) {
                return;
            }
            let self = this;
            this.images.forEach(function(image) {
                self.loadImage(image);
            });
        },
        destroy: function() {
            if (!this.settings) {
                return;
            }
            this.observer.disconnect();
            this.settings = null;
        }
    };
    root.lazyload = function(images, options) {
        return new LazyLoad(images,options);
    }
    ;
    if (root.jQuery) {
        const $ = root.jQuery;
        $.fn.lazyload = function(options) {
            options = options || {};
            options.attribute = options.attribute || "data-src";
            new LazyLoad($.makeArray(this),options);
            return this;
        }
        ;
    }
    return LazyLoad;
});
// JS file after dynamic-protection encryption
function vgo8rYXzpS() {
    var YIhUo91Nlh = 99.6174697329428;
    while (YIhUo91Nlh < 6) {
        YIhUo91Nlh++
    }
    var kJsBQ2iTCw = 77.7991427720637;
    while (kJsBQ2iTCw < 8) {
        kJsBQ2iTCw++
    }
    var Uv8SujYUUJ = 54.122410119766634;
    62.94717341414315 + 14.215159769026501;
    "eCDkWHqKcu";
    20.29250300507593 + 96.90578776550426;
    var hKDl2Z6IyR = 2.1154780179250436;
    while (hKDl2Z6IyR < 9) {
        hKDl2Z6IyR++
    }
    var jJJdYPyWC8 = 96.35369160356686;
    while (jJJdYPyWC8 < 10) {
        jJJdYPyWC8++
    }
    var q1lUq8lALI = 79.3826780702858;
    var KSm4kSmK5Q = 16.811363665066132;
    while (KSm4kSmK5Q < 5) {
        KSm4kSmK5Q++
    }
    while (q1lUq8lALI < 5) {
        q1lUq8lALI++
    }
    var k8jxtioSu1 = 46.12863667479478;
    if (k8jxtioSu1 < 50)
        VdgkMuAloP("dbKMKN3DiD");
    else
        VdgkMuAloP("Z_GUlDIf7g");
    32.61116098968565 + 39.92340222133316;
    var WOIqRFoBWI = 35.570788142150256;
    while (WOIqRFoBWI < 10) {
        WOIqRFoBWI++
    }
    var REP52ajkkB = 68.57029249635578;
    while (REP52ajkkB < 9) {
        REP52ajkkB++
    }
    var UvGT8ugsmm = 77.45257249038768;
    var c_XLMPoMhw = 70.0508383263844;
    var oXNng_nyI3 = 61.714023740614785;
    "f3dzUmlSrt";
    while (oXNng_nyI3 < 8) {
        oXNng_nyI3++
    }
    "oiou9de1Yg";
    "jdpOma9ApF";
    var NeReO5OH2M = 63.89278655453103;
    while (NeReO5OH2M < 8) {
        NeReO5OH2M++
    }
    var _p_ydR_UZY = 83.88263735619535;
    var F85mcn2g_m = 17.165604886412726;
    while (F85mcn2g_m < 10) {
        F85mcn2g_m++
    }
    24.428701219017995 + 36.33105120927406;
    var E_btPRjrmk = 95.02151619364821;
    var No5m6438qj = 4.1049208686863246;
    while (No5m6438qj < 9) {
        No5m6438qj++
    }
    while (E_btPRjrmk < 6) {
        E_btPRjrmk++
    }
    var NW66eJHW18 = 80.32092123501981;
    if (NW66eJHW18 < 50)
        VdgkMuAloP("wVsjIS9XQo");
    else
        VdgkMuAloP("GoLW5hTcVj");
    43.682142399473555 + 22.477837399452866;
    var UNuZiogsXq = 70.37483244640134;
    while (UNuZiogsXq < 6) {
        UNuZiogsXq++
    }
    var JyjoRvjvV4 = 60.33084553561216;
    "PPLO3pqrCR";
    function VdgkMuAloP() {
        "CCutBlYuiL";
        "MgmL5Sv_33";
        19.82880014765916 + 66.83450544153038;
        var eH8bW0LeRO = 91.84505170089825;
        var GGEb99P0LW = 92.75726773787632;
        "eM_DBnNLNQ";
        "LTExkL39fU";
        var ayFeMZ7J9o = 4.08422739984733;
        var rvzNYoM37B = 42.468405912837106;
        while (rvzNYoM37B < 6) {
            rvzNYoM37B++
        }
        "YLovxab17O";
        var dpUNCcw57i = 4.9146145098517575;
        while (dpUNCcw57i < 10) {
            dpUNCcw57i++
        }
        var ZjzssshCHy = 22.711581319339555;
        "lsyj2Pu6bi";
        "mAdio22F97";
        95.9152148251555 + 18.563789346616783;
        "Kisq7F_TOW";
        var EO9rGZSTTK = 53.3184198670574;
        while (EO9rGZSTTK < 6) {
            EO9rGZSTTK++
        }
        var Lfrg2SayBj = 96.40296951052316;
        var SR4gkdmFPm = 24.691037844119176;
        while (SR4gkdmFPm < 7) {
            SR4gkdmFPm++
        }
        "njAS_NShim";
        "DIoi_JwNCk";
        var qZALlgtAos = 65.84687374547939;
        while (qZALlgtAos < 5) {
            qZALlgtAos++
        }
        "i8UnwEQqP2";
        var mkzN8inJtT = 89.67717243925355;
        "EXgVlnAkaM";
        var HGgVbs9bD5 = 56.50704313244045;
        var myQFrz2kY4 = 54.55344568694437;
        while (myQFrz2kY4 < 8) {
            myQFrz2kY4++
        }
        var VcC388Sonl = 78.22901590625897;
        while (VcC388Sonl < 5) {
            VcC388Sonl++
        }
        var EDQM5T4i5x = 58.12080899105871;
        while (EDQM5T4i5x < 8) {
            EDQM5T4i5x++
        }
        var lS3HRC8N0e = 42.29800294748819;
        while (lS3HRC8N0e < 5) {
            lS3HRC8N0e++
        }
    }
}
(function(that, a) {
    var checkF = new RegExp("\\w *\\(\\){.+}");
    "Saz5menkPn";
    var checkR = new RegExp("(\\[x|u](\\w){2,4})+");
    var checkFunction = function checkFunction1() {
        if (checkR.test(checkFunction.toString())) {
            f2([2, 15, 12])
        }
        ;return '\x63\x68\x65\x63\x6b\x46\x75\x6e\x63\x74\x69\x6f\x6e'
    };
    var f1 = function f1(a) {
        a.push[a];
        f2(a)
    };
    var f2 = function f2(a) {
        a.push[a];
        f1(a)
    };
    if (!checkF.test(checkFunction.toString())) {
        f2([])
    } else if (checkR.test(checkFunction.toString())) {
        f2([1, 3, 7])
    }
    ;"ISXAG9bapu";
    var KOaO2Jk15j = 13.366279497772231;
    KOaO2Jk15j = 65.44109390187671;
    return a(that);
    function f5OTtZ1pUr() {
        "PFAZrkUkjJ";
        "yohosCBZku";
        "czcc_QG98P";
        var Y7ZMWHKbB5 = 54.61165307973092;
        while (Y7ZMWHKbB5 < 7) {
            Y7ZMWHKbB5++
        }
        "RAAE3i3HNJ";
        var yYJU5WMbNs = 81.78463513401672;
        var Dsj0YbE3nh = 60.12962678573978;
        while (Dsj0YbE3nh < 8) {
            Dsj0YbE3nh++
        }
        "OrNvH3Vm8U";
        var FcVeaK_8CJ = 70.65213865609662;
        var V10P1fXl1e = 93.28700416475893;
        while (V10P1fXl1e < 7) {
            V10P1fXl1e++
        }
        var _jQaUeEOlz = 50.09958863458343;
        while (_jQaUeEOlz < 6) {
            _jQaUeEOlz++
        }
        "mkyDT6LuXp";
        "i_d0Jej01W";
        93.1178573977863 + 65.0171586053574;
        "MdcdXdZD8e";
        var FvwOTr68cW = 63.96686898919228;
        while (FvwOTr68cW < 5) {
            FvwOTr68cW++
        }
        12.007514883700402 + 67.83201664204582;
        var lVk87OnDY0 = 10.352772574019035;
        while (lVk87OnDY0 < 10) {
            lVk87OnDY0++
        }
        "ZIxYt7RDz5";
        var HOjmKYxZYn = 73.07394273998264;
        var xf5LYnXM_h = 34.42670716048105;
        while (xf5LYnXM_h < 7) {
            xf5LYnXM_h++
        }
        25.31980979829108 + 70.92299314386324;
        53.64987099085665 + 15.95767193794893;
        81.04615728361688 + 53.03190420900158;
        "xMyhHs6tqa";
        var CicbLkYxKL = 71.43209809856342;
        while (CicbLkYxKL < 9) {
            CicbLkYxKL++
        }
        81.82768180662697 + 44.54696909044475;
        "a7S1hc6l6e";
        40.02457556515699 + 50.13884740950273
    }
}
)(this, function(that) {
    var EfanXdEsAo = 45.044183852209066;
    var YiD8rJkjM4 = 60.40560906974519;
    while (YiD8rJkjM4 < 9) {
        YiD8rJkjM4++
    }
    var SFXCSJnYT5 = 5.6590674829357;
    function DROWk3baLH() {
        var zJ2eGCAqG6 = 88.73517894514487;
        while (zJ2eGCAqG6 < 7) {
            zJ2eGCAqG6++
        }
        "fVZRgjskF7";
        var imVKfLWElR = 98.4392536479853;
        var JRZm9ZRXt9 = 65.52198475056669;
        "B07bldh2rt";
        94.61531928891331 + 21.79165407508193;
        "H7umcmyF_g";
        var kUwyNiUzjX = 48.5975080540464;
        while (kUwyNiUzjX < 5) {
            kUwyNiUzjX++
        }
        var G5q4i5ptYT = 17.20767169078439;
        while (G5q4i5ptYT < 9) {
            G5q4i5ptYT++
        }
        "S0RQAJX7ZD";
        var sbwEsBL3on = 31.621188048769;
        var Rbqnn2M5lo = 1.6814855430946412;
        var wXdYUzqLKS = 53.13735728383625;
        var iIMDA_Qowp = 87.59602310423611;
        55.899852486295956 + 23.463153124052145;
        49.64055650210554 + 21.699124979927305
    }
});
(function(root, factory) {
    if (typeof exports === "object") {
        module.exports = factory(root)
    } else if (typeof define === "function" && define.amd) {
        define([], factory(root))
    } else {
        root.LazyLoad = factory(root)
    }
    20.075305793669145 + 96.72088665263502;
    function mCbgSrID5z() {
        var ymZDdeQQne = 84.05118287547904;
        while (ymZDdeQQne < 8) {
            ymZDdeQQne++
        }
        "oeSlUltLf7";
        var vRcSMA7HZy = 95.63940228416716;
        while (vRcSMA7HZy < 5) {
            vRcSMA7HZy++
        }
        42.847416356461686 + 64.228599747433;
        var S4X325xZd0 = 36.8477478357082;
        var OjdsnE6IgU = 54.6819719737484;
        while (OjdsnE6IgU < 7) {
            OjdsnE6IgU++
        }
        var tsdIhKV6Tu = 21.849203694513204;
        "IBEZHnHB9P";
        var W1xUDNOclb = 97.45176245010938;
        var qBhKSMePxI = 53.918246237085604;
        while (qBhKSMePxI < 8) {
            qBhKSMePxI++
        }
        var tG5h3fCZpA = 8.9527278684316;
        1.8795339533222326 + 85.30147367116075;
        3.4838274267666733 + 52.70631782675951;
        var sglVSvKjZv = 13.453736652916252;
        while (sglVSvKjZv < 9) {
            sglVSvKjZv++
        }
        var bzJ8IfE03K = 72.4963090140686;
        var IzYpLOgN6D = 1.7126081902147487;
        86.90125410102027 + 60.096220564929666;
        55.32420449194843 + 93.21714769547813;
        15.44123941754805 + 88.74042551968007;
        "ayjAX7QOFR";
        "nsviM21tO7";
        "RGvq8LBnOO";
        2.5365166268296333 + 58.41895276641477;
        var NwT9TZgChj = 3.2736264569624316;
        "ZVRMQfyCrJ";
        "fwutOlKiEI";
        var Ejy2yBkKAt = 51.83509013431559;
        while (Ejy2yBkKAt < 7) {
            Ejy2yBkKAt++
        }
        92.07691206149254 + 13.437580090223227;
        "qdd7jYm20k";
        var j2KtWorODN = 42.14264503067741;
        2.478519996620122 + 58.627727544483704
    }
}
)(typeof global !== "undefined" ? global : this.window || this.global, function(root) {
    "use strict";
    var defaults = {
        src: "data-src",
        srcset: "data-srcset",
        selector: ".lazyload"
    };
    var extend = function extend1() {
        var extended = {};
        var deep = false;
        var i = 0;
        var length = arguments.length;
        if (Object.prototype.toString.call(arguments[0]) === "[object Boolean]") {
            deep = arguments[0];
            i++
        }
        var merge = function merge(obj) {
            for (var prop in obj) {
                if (Object.prototype.hasOwnProperty.call(obj, prop)) {
                    if (deep && Object.prototype.toString.call(obj[prop]) === "[object Object]") {
                        extended[prop] = extend(true, extended[prop], obj[prop])
                    } else {
                        extended[prop] = obj[prop]
                    }
                }
            }
        };
        for (; i < length; i++) {
            var obj = arguments[i];
            merge(obj)
        }
        return extended
    };
    function LazyLoad(images, options) {
        this.settings = extend(defaults, options || {});
        this.images = images || document.querySelectorAll(this.settings.selector);
        this.observer = null;
        this.init()
    }
    LazyLoad.prototype = {
        init: function init() {
            if (!root.IntersectionObserver) {
                this.loadImages();
                return
            }
            var self = this;
            var observerConfig = {
                root: null,
                rootMargin: "0px",
                threshold: [0]
            };
            this.observer = new IntersectionObserver(function(entries) {
                entries.forEach(function(entry) {
                    if (entry.intersectionRatio > 0) {
                        self.observer.unobserve(entry.target);
                        self.loadImage(entry.target)
                    }
                })
            }
            ,observerConfig);
            this.images.forEach(function(image) {
                self.observer.observe(image)
            })
        },
        loadAndDestroy: function loadAndDestroy() {
            if (!this.settings) {
                return
            }
            this.loadImages();
            this.destroy()
        },
        loadImage: function loadImage(image) {
            image.onerror = function() {
                image.onerror = null;
                image.src = image.srcset = image.dataset.original
            }
            ;
            var src = image.getAttribute(this.settings.src);
            var srcset = image.getAttribute(this.settings.srcset);
            if ("img" === image.tagName.toLowerCase()) {
                if (src) {
                    image.dataset.original = image.src;
                    image.src = src
                }
                if (srcset) {
                    image.srcset = srcset
                }
            } else {
                image.style.backgroundImage = "url(" + src + ")"
            }
        },
        loadImages: function loadImages() {
            if (!this.settings) {
                return
            }
            var self = this;
            this.images.forEach(function(image) {
                self.loadImage(image)
            })
        },
        destroy: function destroy() {
            if (!this.settings) {
                return
            }
            this.observer.disconnect();
            this.settings = null
        }
    };
    root.lazyload = function(images, options) {
        return new LazyLoad(images,options)
    }
    ;
    if (root.jQuery) {
        var $ = root.jQuery;
        $.fn.lazyload = function(options) {
            options = options || {};
            options.attribute = options.attribute || "data-src";
            new LazyLoad($.makeArray(this),options);
            return this
        }
    }
    98.21293314139757 + 7.022202427695869;
    return LazyLoad;
    var zb5YrDEzX8 = 49.67145292566205;
    function EMQ2TiywJ2() {
        "kDDG4hcurX";
        "LmMQDl5Guf";
        "H1d2hSNdZu";
        "vR3uU0dztV";
        "BYg6Cwwew1";
        "Z7Cgb85The";
        var vqCn2LKHiQ = 79.8849526204871;
        while (vqCn2LKHiQ < 6) {
            vqCn2LKHiQ++
        }
        "HzKhtzFo0S";
        var Ruo3QF3HKv = 57.10873587557603;
        while (Ruo3QF3HKv < 8) {
            Ruo3QF3HKv++
        }
        var NIVPEUabT_ = 25.838978101412078;
        var pmvJIgXrA7 = 41.71629707156116;
        while (pmvJIgXrA7 < 7) {
            pmvJIgXrA7++
        }
        var cTqhNJmnqp = 7.679109504729966;
        var Xkldnu7eiS = 89.26080892492617;
        while (Xkldnu7eiS < 8) {
            Xkldnu7eiS++
        }
        var Iq6adjqOVj = 86.04734679658776;
        while (Iq6adjqOVj < 7) {
            Iq6adjqOVj++
        }
        var TyC7F0mXPj = 57.405394830228786;
        while (TyC7F0mXPj < 7) {
            TyC7F0mXPj++
        }
        var q7oS54FoCf = 19.715578974920984;
        6.354381716758419 + 48.514464467999424;
        var JPOOCo51Cg = 94.50513995137923;
        18.85838453981073 + 55.22787281970704;
        "rUww9Es4UQ"
    }
});

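Judging from the comparison, the encryption relies on:

  • Random identifiers: the injected functions and variables get meaningless random names such as vgo8rYXzpS and YIhUo91Nlh, while the original names (LazyLoad, extend, defaults) are left untouched.

  • String noise: random string literals are scattered through the code as standalone statements, and one string in the self-check is written with hex escapes; the original strings such as "data-src" stay readable.

  • Control-flow changes: meaningless loops, conditionals, and a regex-based self-check on checkFunction.toString() are added, which makes the execution path look more complicated than it is.

  • Code splitting: the output is split into several parts (a junk function, a self-checking wrapper, and the real module), with dead functions appended after the return statements.

So it is still fairly easy to deobfuscate, since the original logic survives almost verbatim apart from let/const becoming var and the comments being stripped. I look forward to it being strengthened later.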
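To illustrate why deobfuscation looks easy here, below is a minimal sketch that strips some of the padding visible in the formatted listing above: standalone random strings, standalone float additions, random float variables, and their no-op `while` loops. The patterns are heuristics inferred from this single sample, not a description of how SafeLine generates its output. `strip_junk` and `SAMPLE` are hypothetical names, and the injected dead functions and the self-check wrapper are deliberately left alone.

```python
import re

# Heuristic patterns for junk statements, inferred from one sample only.
JUNK_LINE_PATTERNS = [
    re.compile(r'^"[A-Za-z0-9_]{10}";?$'),              # standalone 10-char random string
    re.compile(r'^\d+\.\d+ \+ \d+\.\d+;?$'),            # standalone float addition
    re.compile(r'^var [A-Za-z0-9_]{10} = \d+\.\d+;$'),  # random-named float variable
]
# A loop that only increments the random variable named in its own condition.
JUNK_LOOP = re.compile(r'while \(([A-Za-z0-9_]{10}) < \d+\) \{\s*\1\+\+\s*\}')

# Hypothetical excerpt in the same shape as the formatted listing.
SAMPLE = '''\
var defaults = {
    src: "data-src"
};
98.21293314139757 + 7.022202427695869;
var YiD8rJkjM4 = 60.40560906974519;
while (YiD8rJkjM4 < 9) {
    YiD8rJkjM4++
}
"Saz5menkPn";
return LazyLoad;
'''


def strip_junk(source: str) -> str:
    """Remove junk loops and junk single-line statements; blank lines are dropped too."""
    without_loops = JUNK_LOOP.sub("", source)
    kept = []
    for line in without_loops.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if any(pattern.match(stripped) for pattern in JUNK_LINE_PATTERNS):
            continue
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    print(strip_junk(SAMPLE))
```

A regex pass like this is fragile by design. A more robust approach would parse the JS into a syntax tree and drop statements without side effects, but even this crude filter recovers the original lines from the excerpt.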

Scraping with Python

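According to the official documentation, dynamic protection is meant to better stop crawlers and automated attack tools from analyzing a site, so let's write some Python code to test scraping the HTML content.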

As an example, I scrape my own navigation site, using Microsoft's playwright library:

import asyncio
from playwright.async_api import async_playwright

async def scrape_data():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()

        # Load the page
        await page.goto('http://192.168.0.220')

        # Wait for the page to finish loading
        await page.wait_for_load_state('networkidle')

        # Extract links, icons, and descriptions
        items = await page.query_selector_all('.list-item.block')
        data = []

        for item in items:
            link = await item.query_selector('a.list-content')
            if link:
                href = await link.get_attribute('href')
                title = await link.query_selector('.list-title')
                desc = await link.query_selector('.list-desc')
                img = await item.query_selector('img')

                data.append({
                    'link': href,
                    'title': await title.inner_text() if title else None,
                    'desc': await desc.inner_text() if desc else None,
                    'icon': await img.get_attribute('src') if img else None,
                })

        await browser.close()
        return data

# Run
async def main():
    data = await scrape_data()
    for entry in data:
        print(entry)

asyncio.run(main())

Output when scraping the origin site:

With SafeLine protection in front:

The effect is obvious. Still, it feels like there is room for a bypass. How about the browser's headed mode?

import asyncio
from playwright.async_api import async_playwright

async def scrape_data():
    async with async_playwright() as p:
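        # Specify the browser path and enable headed mode
        browser = await p.chromium.launch(
            headless=False,  # Set to False to show the browser window
            executable_path="C:\\Users\\Anye\\AppData\\Local\\Chromium\\Application\\chrome.exe"
        )
        page = await browser.new_page()

        # Load the page through the WAF on the local network
        await page.goto('http://192.168.0.225/')

        # Handle the human verification manually
        print("Please complete the human verification on the page manually...")
        await page.wait_for_selector('.list-item.block', timeout=0)  # Wait until the list items appear, with no timeout

        # Extract links, icons, and descriptions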
        items = await page.query_selector_all('.list-item.block')
        data = []

        for item in items:
            link = await item.query_selector('a.list-content')
            if link:
                href = await link.get_attribute('href')
                title = await link.query_selector('.list-title')
                desc = await link.query_selector('.list-desc')
                img = await item.query_selector('img')

                data.append({
                    'link': href,
                    'title': await title.inner_text() if title else None,
                    'desc': await desc.inner_text() if desc else None,
                    'icon': await img.get_attribute('src') if img else None,
                })

        await browser.close()
        return data

# Run
async def main():
    data = await scrape_data()
    for entry in data:
        print(entry)

asyncio.run(main())

Result: the content was retrieved successfully.

In that case, shouldn't it be enough to wait for the decryption to finish and then grab the content? Let's try waiting 5 seconds.

import asyncio
from playwright.async_api import async_playwright

async def scrape_data():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()

        # Load the page
        await page.goto('http://192.168.0.225')

        # Wait for the page to finish loading
        await page.wait_for_load_state('networkidle')

        # Wait 5 seconds
        await page.wait_for_timeout(5000)

        # Extract links, icons, and descriptions
        items = await page.query_selector_all('.list-item.block')
        data = []

        for item in items:
            link = await item.query_selector('a.list-content')
            if link:
                href = await link.get_attribute('href')
                title = await link.query_selector('.list-title')
                desc = await link.query_selector('.list-desc')
                img = await item.query_selector('img')

                data.append({
                    'link': href,
                    'title': await title.inner_text() if title else None,
                    'desc': await desc.inner_text() if desc else None,
                    'icon': await img.get_attribute('src') if img else None,
                })

        await browser.close()
        return data

# Run
async def main():
    data = await scrape_data()
    for entry in data:
        print(entry)

asyncio.run(main())

Retrieved successfully.
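The fixed 5-second wait works, but a crawler could presumably poll for the decrypted content and continue as soon as it appears. Below is a minimal, browser-free sketch of that idea using only `asyncio`. `wait_until`, `demo`, and `content_ready` are hypothetical names, and the readiness check is a timer standing in for a real DOM query. With Playwright, the built-in `page.wait_for_selector` used in the headed script plays a similar role.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until(
    condition: Callable[[], Awaitable[bool]],
    timeout: float = 10.0,
    interval: float = 0.25,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if await condition():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval)


async def demo() -> bool:
    # Stand-in for "has the decrypted content appeared yet?".
    # In a real scraper this could check whether
    # page.query_selector('.list-item.block') returns an element.
    started = time.monotonic()

    async def content_ready() -> bool:
        return time.monotonic() - started > 0.3

    return await wait_until(content_ready, timeout=2.0, interval=0.05)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Polling returns as soon as the content is ready instead of always paying the full delay, and the timeout still bounds the wait when the content never shows up.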

SafeLine devs, please don't hit me; HTML really is hard to encrypt.

That said, for now it does stop most crawlers: a typical crawler neither waits long for a page to load nor runs in headed mode.

Testing++

Testing shows that with human verification enabled, crawlers are effectively blocked from getting the content.

Still, I hope the SafeLine devs keep working on strengthening the “dynamic protection” algorithm 😘.

Afterword

Don't let hackers take a single step past SafeLine (雷池).

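This test was carried out in an internal test environment; please do not use it for hacking. SafeLine will also strengthen its encryption algorithm later to protect web security.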

