Intermittent 404 on SCAN and /tag functions

Michael Hollinger | Forum Activity | Posted: Mon, Feb 24 2014 11:58 AM

I am writing a script that examines text for biblical references and then stores them in a database. In most cases the SCAN API function gives me what I need, but in several instances it returns a 404.

I've ruled out the length: I'm chunking the calls into 1800 characters at a time.
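For reference, the chunking described above might look roughly like the sketch below. The function name `chunk_text` and the choice to break at whitespace are assumptions for illustration, not the script's actual code; only the 1800-character limit comes from the post.

```python
def chunk_text(text, limit=1800):
    """Split text into pieces of at most `limit` characters, breaking at whitespace.

    Hypothetical helper: breaking at whitespace keeps a reference such as
    "John 3:16" from being cut in half between two API calls.
    """
    chunks, current = [], ""
    for word in text.split():
        # A single "word" longer than the limit has to be hard-split.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Note that the limit here counts characters before URL encoding. Percent-encoding can make the final URL noticeably longer than the raw chunk, so the encoded length may be worth checking as well.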

I've ruled out the server, because it consistently either accepts or rejects the same URL calls.
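Because the same input fails every time, one way to narrow things down is to shrink a failing chunk until only the offending fragment remains. The sketch below is a simple halving search and only an assumption about how one might debug this. `shrink_failing_text` and the `fails` callback are hypothetical names; in practice `fails` would wrap the real API call and return True when the status code is 404.

```python
def shrink_failing_text(text, fails):
    """Narrow `text` down to a smaller piece for which `fails` still returns True.

    `fails` is a caller-supplied predicate (hypothetical): it takes a string
    and returns True if sending that string to the API produces a 404.
    """
    if not fails(text):
        raise ValueError("text does not fail to begin with")
    while len(text) > 1:
        mid = len(text) // 2
        left, right = text[:mid], text[mid:]
        if fails(left):
            text = left
        elif fails(right):
            text = right
        else:
            # Neither half fails alone, so the trigger spans the midpoint.
            break
    return text


# Stand-in predicate for demonstration only: pretend a "%" triggers the 404.
culprit = shrink_failing_text("John 3:16 is 100% clear", lambda s: "%" in s)
print(repr(culprit))
```

If the search stops at a single character or a short fragment, that fragment is a good candidate to examine for encoding or URL-escaping problems.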

I've ruled out passing text that contains no references.

Why would the API function ever return a 404?

{'cookies': <<class 'requests.cookies.RequestsCookieJar'>[]>, '_content': '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">\r\n<html xmlns="">\r\n<head>\r\n<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>\r\n<title>404 - File or directory not found.</title>\r\n<style type="text/css">\r\n<!--\r\nbody{margin:0;font-size:.7em;font-family:Verdana, Arial, Helvetica, sans-serif;background:#EEEEEE;}\r\nfieldset{padding:0 15px 10px 15px;} \r\nh1{font-size:2.4em;margin:0;color:#FFF;}\r\nh2{font-size:1.7em;margin:0;color:#CC0000;} \r\nh3{font-size:1.2em;margin:10px 0 0 0;color:#000000;} \r\n#header{width:96%;margin:0 0 0 0;padding:6px 2% 6px 2%;font-family:"trebuchet MS", Verdana, sans-serif;color:#FFF;\r\nbackground-color:#555555;}\r\n#content{margin:0 0 0 2%;position:relative;}\r\n.content-container{background:#FFF;width:96%;margin-top:8px;padding:10px;position:relative;}\r\n-->\r\n</style>\r\n</head>\r\n<body>\r\n<div id="header"><h1>Server Error</h1></div>\r\n<div id="content">\r\n <div class="content-container"><fieldset>\r\n  <h2>404 - File or directory not found.</h2>\r\n  <h3>The resource you are looking for might have been removed, had its name changed, or is temporarily unavailable.</h3>\r\n </fieldset></div>\r\n</div>\r\n</body>\r\n</html>\r\n', 'headers': CaseInsensitiveDict({'content-length': '741', 'content-encoding': 'gzip', 'vary': 'Accept-Encoding', 'connection': 'Keep-Alive', 'date': 'Mon, 24 Feb 2014 19:47:47 GMT', 'access-control-allow-origin': '*', 'access-control-allow-headers': 'X-Requested-With', 'content-type': 'text/html'}), 'url': u'', 'status_code': 404, '_content_consumed': True, 'encoding': 'ISO-8859-1', 'request': <PreparedRequest [GET]>, 'connection': <requests.adapters.HTTPAdapter object at 0x10136b990>, 'elapsed': datetime.timedelta(0, 0, 162286), 'raw': <requests.packages.urllib3.response.HTTPResponse object at 0x10136bf10>, 'reason': 'Not Found', 'history': []}
