mirror of
https://github.com/rjhansen/nsrlsvr.git
synced 2026-01-08 22:08:08 -05:00
Initial import from Subversion sources.
80
CHANGELOG
Normal file
@@ -0,0 +1,80 @@
1.1.1: February 22, 2013
  * Code cleanups. Now has better support for using custom datasets.
    Less dependence on scripts for building.

1.1: May 2, 2012
  * Supports version 2 of the wire protocol, which introduces new
    commands: STATUS (gives server status), BYE (what it says),
    UPSHIFT (attempt to negotiate to a more recent protocol) and
    DOWNSHIFT (negotiate to a lower protocol). Version 2 also
    supports multiple QUERY commands in a single connection, which
    helps a lot when fighting off port exhaustion.
  * Switched from blocking I/O to poll()-based I/O. This helps
    deal with the out-of-control system loads that some users
    were seeing.
  * Uses RDS 2.36

1.0.6: January 20, 2012
  * Discovered that Win32 I/O redirection didn't work at all.
    Whoops. This got fixed.

1.0.5: January 17, 2012
  * 1.0.4 added a bad regex that didn't match as much as it
    should have. This had the effect of stripping SHA-1 hashes
    down to 128 bits. Whoops.
  * Now compiles on FreeBSD 9.0.
  * nsrlparse became nsrlparse.py
  * nsrllookup became nsrllookup.py
  * Fixed documentation to reflect these name changes

1.0.4: January 3, 2012
  * Added a preflight script to help in development. This has no
    effect on end-users.
  * Removed a bit of debugging output that was accidentally left
    in.
  * Moved the 'populate' script to 'nsrlparse' and added it to
    the list of installed files
  * MD5 is now fully supported, as is interoperability with
    md5deep.

1.0.3: January 3, 2012
  * Fixed an interoperability bug with sha1deep.

1.0.2: December 30, 2011
  * Ubuntu 11.10 complains about handler.cc because some write()
    calls aren't checked for a return value of -1. Virtually all
    of those warnings were superfluous, but one could possibly
    have caused an intermittent error sooner or later. They have
    all been patched, and it now compiles cleanly on Ubuntu 11.10.

1.0.1: December 30, 2011
  * nsrllookup had a bug that would become manifest while querying
    millions of records. Now nsrllookup breaks it up into blocks
    of 4096 queries (a maximum of 164k of data per connection).
    This will hopefully improve performance for those times when
    you want to push millions of queries to the server.

1.0: December 30, 2011
  * First ready-for-the-users release. The only new feature over
    the release candidate series is a much improved installation
    procedure.
  * It should be possible to make RPMs, Debian packages, or
    what-have-you, since the install process is now bog-standard
    GNU ./configure && make && make install

1.0rcX: December, 2011
  * Ready for limited beta testing. The only change visible to
    end users was introducing support for OS X 10.6.
  * A bug that prevented reliable functioning on Fedora and OpenSUSE
    was found and crushed.
  * The internals were ported from a very C-like C++ subset to a
    much more idiomatic C++ style. This reduced our dependency on
    GNU getline(), which had been the major obstacle to OS X 10.6
    support.

0.9: December, 2011
  * Successfully tested nsrlsvr with the full NIST NSRL RDS on a 4 GB
    Apple iMac. It made the machine completely unusable as a desktop,
    but was able to successfully service requests.
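The batching described in the 1.0.1 entry can be sketched in a few lines of Python. This is a minimal illustration, not nsrllookup's actual code: the helper name `batches` is hypothetical, and the payload arithmetic assumes each query is one 40-character SHA-1 hex digest plus a newline. Under that assumption 4096 queries come to 167,936 bytes, which matches the 164k figure quoted in the entry.

```python
from itertools import islice

BLOCK_SIZE = 4096  # queries per connection, per the 1.0.1 entry


def batches(hashes, size=BLOCK_SIZE):
    """Yield successive lists of at most `size` hashes (hypothetical helper)."""
    iterator = iter(hashes)
    while True:
        block = list(islice(iterator, size))
        if not block:
            return
        yield block


# Assumed payload: one 40-character SHA-1 hex digest plus a newline per query.
BYTES_PER_QUERY = 40 + 1
MAX_PAYLOAD = BLOCK_SIZE * BYTES_PER_QUERY  # bytes sent per connection
```

A client would open one connection per block yielded by `batches`, so no single connection carries more than `MAX_PAYLOAD` bytes of query data.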
56
INSTALL
Normal file
@@ -0,0 +1,56 @@
Installation instructions:

1.  Decide what data set you want nsrlsvr to query against by
    default. Your options are:

    a.  NIST's NSRL RDS (http://www.nsrl.nist.gov/).

    b.  A dataset that you provide at compile-time. For instance,
        if you have a proprietary set of SHA-1 hashes of known
        malware and know you'll only ever want to use that, this is
        the way to go.

    You may also tell nsrlsvr to use a different file by passing the
    "-f" flag when launching the server. This file must contain nothing
    but MD5, SHA-1 or SHA-256 hashes, one per line in hexadecimal format,
    with no other content on a line. This option is mostly for developer
    testing: most users will never touch it.


2.  If you're compiling it using your own dataset, your dataset must
    be in a format nsrlsvr understands. One good way to do this is with
    Jesse Kornblum's md5deep tool:

    $ md5deep -c [FILES] > my_dataset.txt


3.  Run the ./configure script in one of the following ways
    (--with-custom and --with-nsrl are mutually exclusive):

    a.  No options: if the current RDS zipfile exists in the build
        directory, use that; otherwise, try to download it.
    b.  --with-custom=my_set.txt: use your own dataset
    c.  --with-nsrl=filename: use an already-downloaded NSRL RDS
        zip file (one that lives, e.g., outside the build dir).
        You will want to use this option if a newer NSRL RDS has
        been released than the one nsrlsvr knows about.


4.  Once you've completed the "make && make install" dance, one
    executable will be installed to $PREFIX/bin: nsrlsvr, the server
    application, which runs as a UNIX daemon.


5.  As an example of how it can be used:

    $ md5deep -c /path/to/evil/files > evil_dataset.txt
    $ ./configure --with-custom=evil_dataset.txt
    $ make
    $ sudo make install

    You've now created a custom dataset that contains MD5 hashes
    of files you've declared to be evil.

    $ nsrlsvr -t 1800

    You've started the server and instructed it to automatically
    shut down after a half-hour of inactivity.
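Step 1 requires that a file passed with "-f" contain nothing but MD5, SHA-1 or SHA-256 hashes, one per line in hexadecimal. A quick pre-flight check along those lines might look like the sketch below. The function name `bad_lines` is hypothetical and this is not part of nsrlsvr; it only tests the line format stated above.

```python
import re

# 32, 40 or 64 hex digits: MD5, SHA-1 or SHA-256 respectively.
HASH_LINE = re.compile(r"(?:[0-9A-Fa-f]{32}|[0-9A-Fa-f]{40}|[0-9A-Fa-f]{64})")


def bad_lines(lines):
    """Return (line_number, text) for every line that is not a bare hash."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # fullmatch rejects anything before or after the digest,
        # including trailing file names or whitespace.
        if not HASH_LINE.fullmatch(text):
            problems.append((number, text))
    return problems
```

Running it over an open file, for example `bad_lines(open("my_dataset.txt"))`, lists the lines that would need cleaning before the file is handed to the server.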
19
INSTALL.GIT
Normal file
@@ -0,0 +1,19 @@
If you're reading this, then you're using a Subversion snapshot of
nsrlsvr. Please check your configure.ac script to ensure the
version has "svn" after it. If it doesn't, please holler at me
that I've got a broken version string. :)

Building from Subversion sources is not recommended. At any given
moment the tree may be broken. That said, if you want to live on
the edge, go for it.

1. Do an 'svn up'. Don't assume that just because you checked the
   code out yesterday that it's still the same today. Seriously,
   svn up.
2. 'sh ./bootstrap.sh'. The Subversion tree does not include a
   configure script. If you have a configure script in your
   directory, then it is something you created and it may no longer
   be in sync with changes to the tree. Running the bootstrap
   script will create a new configure script for you.
3. Once you've recreated the configure script, build it just as
   you would a released version.
13
LICENSE
Normal file
@@ -0,0 +1,13 @@
Copyright (c) 2011-2013, Robert J. Hansen <rjh@secret-alchemy.com>

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
3
Makefile.am
Normal file
@@ -0,0 +1,3 @@
EXTRA_DIST=LICENSE README CHANGELOG AUTHORS INSTALL THANKS convert-format.py denistify.py
SUBDIRS=src man
ACLOCAL_AMFLAGS=-I m4
11
README.md
@@ -1,2 +1,11 @@
nsrlsvr
=======
nsrlsvr is a tool to facilitate looking up data in the National Software
Reference Library (NSRL). It's in a 1.1 state, which means it's unlikely
to break in two if you look at it the wrong way but still may not be as
stable as you'd like.

Installation instructions are found in the INSTALL file. Please read them.
Due to the size of the NSRL's reference data set (RDS), installing nsrlsvr
is a little bit more involved than one would like. It isn't hard: it's just
not quite a configure, make, make install dance.
14
THANKS
Normal file
@@ -0,0 +1,14 @@
* RedJack Security <http://www.redjack.com>
  - RedJack's been kind enough to let me hack on nsrlquery during
    business hours during lulls in work. Thanks, guys. It's immensely
    appreciated.

* Jesse Kornblum <jessekornblum@gmail.com>
  - Proposed the original "you know, there ought to be a way..." that
    led to nsrlquery
  - helped make nsrlquery work on OS X 10.6
  - noticed more bugs than can quickly be listed here :)

* Mark Kealiher <mkealiher@gmail.com>
  - Early adopters get to bleed on the cutting edge, and he shed more
    than his due. Thanks, Mark. Hopefully it works better now. :)
5
bootstrap.sh
Executable file
@@ -0,0 +1,5 @@
#!/bin/sh
aclocal -I m4
automake --foreign --add-missing
autoheader
autoconf
163
config.h.in
Normal file
@@ -0,0 +1,163 @@
/* config.h.in.  Generated from configure.ac by autoheader.  */

/* Define to 1 if you have the <arpa/inet.h> header file. */
#undef HAVE_ARPA_INET_H

/* Define to 1 if you have the `fork' function. */
#undef HAVE_FORK

/* Define to 1 if you have the `inet_ntoa' function. */
#undef HAVE_INET_NTOA

/* Define to 1 if the system has the type `intmax_t'. */
#undef HAVE_INTMAX_T

/* Define to 1 if you have the <inttypes.h> header file. */
#undef HAVE_INTTYPES_H

/* Define to 1 if you have the <limits.h> header file. */
#undef HAVE_LIMITS_H

/* Define to 1 if the system has the type `long long int'. */
#undef HAVE_LONG_LONG_INT

/* Define to 1 if you have the <memory.h> header file. */
#undef HAVE_MEMORY_H

/* Define to 1 if you have the `memset' function. */
#undef HAVE_MEMSET

/* Define to 1 if you have the <netinet/in.h> header file. */
#undef HAVE_NETINET_IN_H

/* Define if you have POSIX threads libraries and header files. */
#undef HAVE_PTHREAD

/* Define to 1 if you have the `socket' function. */
#undef HAVE_SOCKET

/* Define to 1 if stdbool.h conforms to C99. */
#undef HAVE_STDBOOL_H

/* Define to 1 if you have the <stdint.h> header file. */
#undef HAVE_STDINT_H

/* Define to 1 if you have the <stdlib.h> header file. */
#undef HAVE_STDLIB_H

/* Define to 1 if you have the <strings.h> header file. */
#undef HAVE_STRINGS_H

/* Define to 1 if you have the <string.h> header file. */
#undef HAVE_STRING_H

/* Define to 1 if you have the <syslog.h> header file. */
#undef HAVE_SYSLOG_H

/* Define to 1 if you have the <sys/socket.h> header file. */
#undef HAVE_SYS_SOCKET_H

/* Define to 1 if you have the <sys/stat.h> header file. */
#undef HAVE_SYS_STAT_H

/* Define to 1 if you have the <sys/types.h> header file. */
#undef HAVE_SYS_TYPES_H

/* Define to 1 if you have the <unistd.h> header file. */
#undef HAVE_UNISTD_H

/* Define to 1 if you have the `vfork' function. */
#undef HAVE_VFORK

/* Define to 1 if you have the <vfork.h> header file. */
#undef HAVE_VFORK_H

/* Define to 1 if `fork' works. */
#undef HAVE_WORKING_FORK

/* Define to 1 if `vfork' works. */
#undef HAVE_WORKING_VFORK

/* Define to 1 if the system has the type `_Bool'. */
#undef HAVE__BOOL

/* Name of package */
#undef PACKAGE

/* Define to the address where bug reports for this package should be sent. */
#undef PACKAGE_BUGREPORT

/* Define to the full name of this package. */
#undef PACKAGE_NAME

/* Define to the full name and version of this package. */
#undef PACKAGE_STRING

/* Define to the one symbol short name of this package. */
#undef PACKAGE_TARNAME

/* Define to the version of this package. */
#undef PACKAGE_VERSION

/* Define to necessary symbol if this constant uses a non-standard name on
   your system. */
#undef PTHREAD_CREATE_JOINABLE

/* Define to 1 if you have the ANSI C header files. */
#undef STDC_HEADERS

/* Version number of package */
#undef VERSION

/* Define for Solaris 2.5.1 so the uint32_t typedef from <sys/synch.h>,
   <pthread.h>, or <semaphore.h> is not used. If the typedef was allowed, the
   #define below would cause a syntax error. */
#undef _UINT32_T

/* Define for Solaris 2.5.1 so the uint8_t typedef from <sys/synch.h>,
   <pthread.h>, or <semaphore.h> is not used. If the typedef was allowed, the
   #define below would cause a syntax error. */
#undef _UINT8_T

/* Define to empty if `const' does not conform to ANSI C. */
#undef const

/* Define to the type of a signed integer type of width exactly 16 bits if
   such a type exists and the standard includes do not define it. */
#undef int16_t

/* Define to the type of a signed integer type of width exactly 32 bits if
   such a type exists and the standard includes do not define it. */
#undef int32_t

/* Define to the type of a signed integer type of width exactly 8 bits if such
   a type exists and the standard includes do not define it. */
#undef int8_t

/* Define to the widest signed integer type if <stdint.h> and <inttypes.h> do
   not define. */
#undef intmax_t

/* Define to `int' if <sys/types.h> does not define. */
#undef pid_t

/* Define to `unsigned int' if <sys/types.h> does not define. */
#undef size_t

/* Define to `int' if <sys/types.h> does not define. */
#undef ssize_t

/* Define to the type of an unsigned integer type of width exactly 16 bits if
   such a type exists and the standard includes do not define it. */
#undef uint16_t

/* Define to the type of an unsigned integer type of width exactly 32 bits if
   such a type exists and the standard includes do not define it. */
#undef uint32_t

/* Define to the type of an unsigned integer type of width exactly 8 bits if
   such a type exists and the standard includes do not define it. */
#undef uint8_t

/* Define as `fork' if `vfork' does not work. */
#undef vfork
118
configure.ac
Normal file
@@ -0,0 +1,118 @@
AC_INIT([NSRL Server], [1.1.2], [Robert J. Hansen <rjh@secret-alchemy.com>], [nsrlsvr], [http://nsrlquery.sourceforge.net])
AC_ARG_WITH([nsrl],
    [AS_HELP_STRING([--with-nsrl],
    [use NIST's NSRL RDS @<:@default: use the NSRL RDS@:>@])],
    [nsrl=${withval}], [nsrl=no])

AC_ARG_WITH([custom],
    [AS_HELP_STRING([--with-custom],
    [use a custom dataset @<:@default: don't@:>@])],
    [custom=${withval}], [custom=no])

if test "x$custom" != "xno" ; then
    AM_PATH_PYTHON([2.7])
fi

if test "x$nsrl" != "xno" && test "x$custom" != "xno" ; then
    AC_MSG_ERROR([The --with-nsrl and --with-custom flags are mutually exclusive.]);
fi

AC_CONFIG_MACRO_DIR([m4])
AC_CONFIG_SRCDIR([src/main.cc])
AC_PREREQ([2.58])
AC_CONFIG_HEADERS([config.h])
AM_INIT_AUTOMAKE([foreign])
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])

AC_PROG_CXX
AC_TYPE_INT8_T
AC_TYPE_UINT8_T
AC_TYPE_INT16_T
AC_TYPE_UINT16_T
AC_TYPE_INT32_T
AC_TYPE_UINT32_T
AC_TYPE_INTMAX_T
ACX_PTHREAD([], AC_MSG_ERROR([pthreads does not appear usable.]))
AC_CHECK_FUNCS([inet_ntoa])
AC_CHECK_FUNCS([memset])
AC_CHECK_FUNCS([socket])
AC_CHECK_HEADERS([arpa/inet.h])
AC_CHECK_HEADERS([limits.h])
AC_CHECK_HEADERS([netinet/in.h])
AC_CHECK_HEADERS([sys/socket.h])
AC_CHECK_HEADERS([syslog.h])
AC_C_CONST
AC_FUNC_FORK
dnl AC_FUNC_GETLOADAVG
AC_HEADER_STDBOOL
AC_TYPE_PID_T
AC_TYPE_SIZE_T
AC_TYPE_SSIZE_T

RDS_URL=http://www.nsrl.nist.gov/RDS/rds_2.39/RDS_239m.zip
nsrl_filename=RDS_239m.zip


if test "x$nsrl" != xno ; then
    if ! test -r $nsrl ; then
        AC_MSG_ERROR([Couldn't find the dataset specified.])
    else
        nsrl_filename=$nsrl
    fi
fi

if ! test -r $nsrl_filename && test "x$custom" = xno ; then
    AC_CHECK_PROG([UNZIP], [unzip], [unzip], AC_MSG_ERROR([unzip not found: this is necessary to use the downloaded NIST NSRL RDS]))
    AC_CHECK_PROG([WGET], [wget], [wget], [no])
    if test "x$WGET" = xwget ; then
        wget $RDS_URL ;
    else
        AC_CHECK_PROG([CURL], [curl], [curl], [no])
        if test "x$CURL" = xcurl ; then
            curl -O $RDS_URL ;
        else
            AC_MSG_ERROR([The NIST NSRL RDS must be downloaded, but neither curl nor wget are in your PATH. Please fix this, and try again.])
        fi
    fi
    AC_MSG_NOTICE([
***
*** I'm going to leave the file $nsrl_filename around in the toplevel of the
*** build directory. If you leave it here, the next time you build this it
*** will save you a long download.
***])
fi

if test "x$custom" = xno ; then
    if ! test -r $nsrl_filename ; then
        AC_MSG_ERROR([
***
*** Couldn't open $nsrl_filename for reading.
***
*** If you used a tilde ("~") in the path, try giving a full directory
*** path: sometimes tilde expansion confuses configure.
***]);
    else
        AC_MSG_NOTICE([uncompressing the NSRL RDS -- this may take a while...])
        rm -f NSRLFile.txt src/NSRLFile.txt
        unzip -o $nsrl_filename NSRLFile.txt
        AC_MSG_NOTICE([converting into nsrlsvr's data format -- please wait...])
        $PYTHON ./denistify.py
        rm -f NSRLFile.txt
    fi
else
    if ! test -r $custom ; then
        AC_MSG_ERROR([
***
*** Couldn't open $custom for reading.
***
*** If you used a tilde ("~") in the path, try giving a full directory
*** path: sometimes tilde expansion confuses configure.
***]);
    fi
    AC_MSG_NOTICE([converting $custom to the proper data format -- please wait...])
    rm -f src/NSRLFile.txt
    $PYTHON ./convert-format.py $custom
fi

AC_OUTPUT([Makefile src/Makefile man/Makefile])
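The way configure.ac picks a data source can be paraphrased in Python. The sketch below is for illustration only: `choose_dataset` and its return values are hypothetical names, and file existence is passed in as a callable so the logic can be read (and tried) without touching the disk.

```python
def choose_dataset(with_nsrl=None, with_custom=None,
                   default_zip="RDS_239m.zip", exists=lambda path: False):
    """Mirror configure.ac's choice of data source (hypothetical helper).

    Returns an (action, path) pair; action is "custom", "nsrl" or "download".
    """
    if with_nsrl and with_custom:
        raise ValueError("--with-nsrl and --with-custom are mutually exclusive")
    if with_custom:
        if not exists(with_custom):
            raise FileNotFoundError(with_custom)
        return ("custom", with_custom)
    if with_nsrl:
        if not exists(with_nsrl):
            raise FileNotFoundError(with_nsrl)
        return ("nsrl", with_nsrl)
    # No flags given: reuse a zip file already in the build directory,
    # otherwise fall back to downloading it with wget or curl.
    if exists(default_zip):
        return ("nsrl", default_zip)
    return ("download", default_zip)
```

The "download" result corresponds to the wget/curl branch; the other two correspond to running denistify.py on the RDS or convert-format.py on the custom file.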
30
convert-format.py
Executable file
@@ -0,0 +1,30 @@
#!/usr/bin/env python

from __future__ import print_function

import os
import re
import sys

# Matches a SHA-256 (64), SHA-1 (40) or MD5 (32) digest in hexadecimal.
hash_re = re.compile(r"([0-9A-Fa-f]{64}|[0-9A-Fa-f]{40}|[0-9A-Fa-f]{32})")

if len(sys.argv) != 2:
    print("No file specified.")
    sys.exit(-1)
if not os.access(sys.argv[1], os.R_OK):
    print("Couldn't read " + sys.argv[1])
    sys.exit(-2)

# Keep the first digest found on each line; skip lines without one.
with open(sys.argv[1]) as fh:
    matches = (hash_re.search(line) for line in fh)
    hashes = [match.group(1) for match in matches if match]

if not hashes:
    print("Zero hashes found -- check to see if this is correct.")
    sys.exit(-4)

# Every digest must have the same length, i.e. come from one algorithm.
first_len = len(hashes[0])
if any(len(digest) != first_len for digest in hashes[1:]):
    print("Multiple different hash algorithms present in " + sys.argv[1])
    sys.exit(-8)

with open("src/NSRLFile.txt", "w") as output:
    for digest in hashes:
        output.write(digest + "\n")
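convert-format.py tells hash algorithms apart purely by digest length, which is why a mix of lengths is reported as "multiple different hash algorithms". A small sketch of that mapping follows; `ALGORITHM_BY_LENGTH` and `algorithms_present` are hypothetical names introduced here and are not part of the script.

```python
# Hex digest length -> algorithm, for the three lengths the script accepts.
ALGORITHM_BY_LENGTH = {32: "MD5", 40: "SHA-1", 64: "SHA-256"}


def algorithms_present(hashes):
    """Return the sorted names of the algorithms implied by digest lengths.

    Raises KeyError for a digest whose length is not 32, 40 or 64.
    """
    return sorted({ALGORITHM_BY_LENGTH[len(digest)] for digest in hashes})
```

A dataset is acceptable to the script when this returns exactly one name.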
23
denistify.py
Executable file
@@ -0,0 +1,23 @@
#!/usr/bin/env python
#coding=UTF-8

import re

# Matches a quoted 32-digit hexadecimal field, i.e. an MD5 digest.
md5_re = re.compile('^.*"([0-9A-Fa-f]{32})".*$')
hashes = []

with open("NSRLFile.txt") as fh:
    for line in fh:
        # The MD5 is taken from the second comma-separated field.
        elements = line.split(",")
        if len(elements) >= 2:
            match = md5_re.match(elements[1])
            if match:
                hashes.append(match.group(1))

hashes.sort()
with open("src/NSRLFile.txt", "w") as fh:
    for entry in hashes:
        fh.write(entry + "\n")
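To see what denistify.py extracts, it helps to run its per-line logic on a sample record. The sketch below wraps that logic in a hypothetical function, `md5_from_record`. The record layout shown ("SHA-1","MD5","CRC32","FileName",...) is an assumption about NSRLFile.txt, and the sample record is made up for illustration (its digests are those of an empty file).

```python
import re

md5_re = re.compile('^.*"([0-9A-Fa-f]{32})".*$')


def md5_from_record(line):
    """Return the MD5 found in the second comma-separated field, or None."""
    elements = line.split(",")
    if len(elements) >= 2:
        match = md5_re.match(elements[1])
        if match:
            return match.group(1)
    return None


# Assumed header row; its second field is not a digest, so it is skipped.
header = '"SHA-1","MD5","CRC32","FileName","FileSize","ProductCode","OpSystemCode","SpecialCode"'
# Made-up sample record in the assumed layout.
record = ('"DA39A3EE5E6B4B0D3255BFEF95601890AFD80709",'
          '"D41D8CD98F00B204E9800998ECF8427E","00000000","empty.txt",0,1,"358",""')
```

Because only the second field is inspected, commas inside later fields such as the file name do not affect the result.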
242
m4/acx_pthread.m4
Normal file
@@ -0,0 +1,242 @@
dnl @synopsis ACX_PTHREAD([ACTION-IF-FOUND[, ACTION-IF-NOT-FOUND]])
dnl
dnl @summary figure out how to build C programs using POSIX threads
dnl
dnl This macro figures out how to build C programs using POSIX threads.
dnl It sets the PTHREAD_LIBS output variable to the threads library and
dnl linker flags, and the PTHREAD_CFLAGS output variable to any special
dnl C compiler flags that are needed. (The user can also force certain
dnl compiler flags/libs to be tested by setting these environment
dnl variables.)
dnl
dnl Also sets PTHREAD_CC to any special C compiler that is needed for
dnl multi-threaded programs (defaults to the value of CC otherwise).
dnl (This is necessary on AIX to use the special cc_r compiler alias.)
dnl
dnl NOTE: You are assumed to not only compile your program with these
dnl flags, but also link it with them as well. e.g. you should link
dnl with $PTHREAD_CC $CFLAGS $PTHREAD_CFLAGS $LDFLAGS ... $PTHREAD_LIBS
dnl $LIBS
dnl
dnl If you are only building threads programs, you may wish to use
dnl these variables in your default LIBS, CFLAGS, and CC:
dnl
dnl        LIBS="$PTHREAD_LIBS $LIBS"
dnl        CFLAGS="$CFLAGS $PTHREAD_CFLAGS"
dnl        CC="$PTHREAD_CC"
dnl
dnl In addition, if the PTHREAD_CREATE_JOINABLE thread-attribute
dnl constant has a nonstandard name, defines PTHREAD_CREATE_JOINABLE to
dnl that name (e.g. PTHREAD_CREATE_UNDETACHED on AIX).
dnl
dnl ACTION-IF-FOUND is a list of shell commands to run if a threads
dnl library is found, and ACTION-IF-NOT-FOUND is a list of commands to
dnl run it if it is not found. If ACTION-IF-FOUND is not specified, the
dnl default action will define HAVE_PTHREAD.
dnl
dnl Please let the authors know if this macro fails on any platform, or
dnl if you have any other suggestions or comments. This macro was based
dnl on work by SGJ on autoconf scripts for FFTW (www.fftw.org) (with
dnl help from M. Frigo), as well as ac_pthread and hb_pthread macros
dnl posted by Alejandro Forero Cuervo to the autoconf macro repository.
dnl We are also grateful for the helpful feedback of numerous users.
dnl
dnl @category InstalledPackages
dnl @author Steven G. Johnson <stevenj@alum.mit.edu>
dnl @version 2006-05-29
dnl @license GPLWithACException

AC_DEFUN([ACX_PTHREAD], [
AC_REQUIRE([AC_CANONICAL_HOST])
AC_LANG_SAVE
AC_LANG_C
acx_pthread_ok=no

# We used to check for pthread.h first, but this fails if pthread.h
# requires special compiler flags (e.g. on Tru64 or Sequent).
# It gets checked for in the link test anyway.

# First of all, check if the user has set any of the PTHREAD_LIBS,
# etcetera environment variables, and if threads linking works using
# them:
if test x"$PTHREAD_LIBS$PTHREAD_CFLAGS" != x; then
        save_CFLAGS="$CFLAGS"
        CFLAGS="$CFLAGS $PTHREAD_CFLAGS"
        save_LIBS="$LIBS"
        LIBS="$PTHREAD_LIBS $LIBS"
        AC_MSG_CHECKING([for pthread_join in LIBS=$PTHREAD_LIBS with CFLAGS=$PTHREAD_CFLAGS])
        AC_TRY_LINK_FUNC(pthread_join, acx_pthread_ok=yes)
        AC_MSG_RESULT($acx_pthread_ok)
        if test x"$acx_pthread_ok" = xno; then
                PTHREAD_LIBS=""
                PTHREAD_CFLAGS=""
        fi
        LIBS="$save_LIBS"
        CFLAGS="$save_CFLAGS"
fi

# We must check for the threads library under a number of different
# names; the ordering is very important because some systems
# (e.g. DEC) have both -lpthread and -lpthreads, where one of the
# libraries is broken (non-POSIX).

# Create a list of thread flags to try. Items starting with a "-" are
# C compiler flags, and other items are library names, except for "none"
# which indicates that we try without any flags at all, and "pthread-config"
# which is a program returning the flags for the Pth emulation library.

acx_pthread_flags="pthreads none -Kthread -kthread lthread -pthread -pthreads -mthreads pthread --thread-safe -mt pthread-config"

# The ordering *is* (sometimes) important. Some notes on the
# individual items follow:

# pthreads: AIX (must check this before -lpthread)
# none: in case threads are in libc; should be tried before -Kthread and
#       other compiler flags to prevent continual compiler warnings
# -Kthread: Sequent (threads in libc, but -Kthread needed for pthread.h)
# -kthread: FreeBSD kernel threads (preferred to -pthread since SMP-able)
# lthread: LinuxThreads port on FreeBSD (also preferred to -pthread)
# -pthread: Linux/gcc (kernel threads), BSD/gcc (userland threads)
# -pthreads: Solaris/gcc
# -mthreads: Mingw32/gcc, Lynx/gcc
# -mt: Sun Workshop C (may only link SunOS threads [-lthread], but it
#      doesn't hurt to check since this sometimes defines pthreads too;
#      also defines -D_REENTRANT)
#      ... -mt is also the pthreads flag for HP/aCC
# pthread: Linux, etcetera
# --thread-safe: KAI C++
# pthread-config: use pthread-config program (for GNU Pth library)

case "${host_cpu}-${host_os}" in
        *solaris*)

        # On Solaris (at least, for some versions), libc contains stubbed
        # (non-functional) versions of the pthreads routines, so link-based
        # tests will erroneously succeed. (We need to link with -pthreads/-mt/
        # -lpthread.) (The stubs are missing pthread_cleanup_push, or rather
        # a function called by this macro, so we could check for that, but
        # who knows whether they'll stub that too in a future libc.) So,
        # we'll just look for -pthreads and -lpthread first:

        acx_pthread_flags="-pthreads pthread -mt -pthread $acx_pthread_flags"
        ;;
esac

if test x"$acx_pthread_ok" = xno; then
for flag in $acx_pthread_flags; do

        case $flag in
                none)
                AC_MSG_CHECKING([whether pthreads work without any flags])
                ;;

                -*)
                AC_MSG_CHECKING([whether pthreads work with $flag])
                PTHREAD_CFLAGS="$flag"
                ;;

                pthread-config)
                AC_CHECK_PROG(acx_pthread_config, pthread-config, yes, no)
                if test x"$acx_pthread_config" = xno; then continue; fi
                PTHREAD_CFLAGS="`pthread-config --cflags`"
                PTHREAD_LIBS="`pthread-config --ldflags` `pthread-config --libs`"
                ;;

                *)
                AC_MSG_CHECKING([for the pthreads library -l$flag])
                PTHREAD_LIBS="-l$flag"
                ;;
        esac

        save_LIBS="$LIBS"
        save_CFLAGS="$CFLAGS"
        LIBS="$PTHREAD_LIBS $LIBS"
        CFLAGS="$CFLAGS $PTHREAD_CFLAGS"

        # Check for various functions. We must include pthread.h,
        # since some functions may be macros. (On the Sequent, we
        # need a special flag -Kthread to make this header compile.)
        # We check for pthread_join because it is in -lpthread on IRIX
        # while pthread_create is in libc. We check for pthread_attr_init
        # due to DEC craziness with -lpthreads. We check for
        # pthread_cleanup_push because it is one of the few pthread
        # functions on Solaris that doesn't have a non-functional libc stub.
        # We try pthread_create on general principles.
        AC_TRY_LINK([#include <pthread.h>],
                    [pthread_t th; pthread_join(th, 0);
                     pthread_attr_init(0); pthread_cleanup_push(0, 0);
                     pthread_create(0,0,0,0); pthread_cleanup_pop(0); ],
                    [acx_pthread_ok=yes])

        LIBS="$save_LIBS"
        CFLAGS="$save_CFLAGS"

        AC_MSG_RESULT($acx_pthread_ok)
        if test "x$acx_pthread_ok" = xyes; then
                break;
        fi

        PTHREAD_LIBS=""
        PTHREAD_CFLAGS=""
done
fi

# Various other checks:
if test "x$acx_pthread_ok" = xyes; then
        save_LIBS="$LIBS"
        LIBS="$PTHREAD_LIBS $LIBS"
        save_CFLAGS="$CFLAGS"
        CFLAGS="$CFLAGS $PTHREAD_CFLAGS"

        # Detect AIX lossage: JOINABLE attribute is called UNDETACHED.
        AC_MSG_CHECKING([for joinable pthread attribute])
        attr_name=unknown
        for attr in PTHREAD_CREATE_JOINABLE PTHREAD_CREATE_UNDETACHED; do
            AC_TRY_LINK([#include <pthread.h>], [int attr=$attr; return attr;],
                        [attr_name=$attr; break])
        done
        AC_MSG_RESULT($attr_name)
        if test "$attr_name" != PTHREAD_CREATE_JOINABLE; then
            AC_DEFINE_UNQUOTED(PTHREAD_CREATE_JOINABLE, $attr_name,
                               [Define to necessary symbol if this constant
                                uses a non-standard name on your system.])
        fi

        AC_MSG_CHECKING([if more special flags are required for pthreads])
        flag=no
        case "${host_cpu}-${host_os}" in
            *-aix* | *-freebsd* | *-darwin*) flag="-D_THREAD_SAFE";;
            *solaris* | *-osf* | *-hpux*) flag="-D_REENTRANT";;
        esac
        AC_MSG_RESULT(${flag})
        if test "x$flag" != xno; then
            PTHREAD_CFLAGS="$flag $PTHREAD_CFLAGS"
        fi

        LIBS="$save_LIBS"
        CFLAGS="$save_CFLAGS"

        # More AIX lossage: must compile with xlc_r or cc_r
        if test x"$GCC" != xyes; then
          AC_CHECK_PROGS(PTHREAD_CC, xlc_r cc_r, ${CC})
        else
          PTHREAD_CC=$CC
        fi
else
        PTHREAD_CC="$CC"
fi

AC_SUBST(PTHREAD_LIBS)
AC_SUBST(PTHREAD_CFLAGS)
AC_SUBST(PTHREAD_CC)

# Finally, execute ACTION-IF-FOUND/ACTION-IF-NOT-FOUND:
if test x"$acx_pthread_ok" = xyes; then
        ifelse([$1],,AC_DEFINE(HAVE_PTHREAD,1,[Define if you have POSIX threads libraries and header files.]),[$1])
        :
else
        acx_pthread_ok=no
        $2
fi
AC_LANG_RESTORE
])dnl ACX_PTHREAD
2
man/Makefile.am
Normal file
@@ -0,0 +1,2 @@
EXTRA_DIST=nsrlsvr.1
man_MANS=nsrlsvr.1
|
||||
62
man/nsrlsvr.1
Normal file
@@ -0,0 +1,62 @@
|
||||
.Dd January 30, 2012
|
||||
.Dt NSRLSVR 1
|
||||
.Os
|
||||
.Sh NAME
|
||||
.Nm nsrlsvr
|
||||
.Nd server yielding hashes from NIST's NSRL RDS
|
||||
.Sh SYNOPSIS
|
||||
.Nm nsrlsvr
|
||||
.Op Fl b
|
||||
.Op Fl h
|
||||
.Op Fl o
|
||||
.Op Fl s
|
||||
.Op Fl S
|
||||
.Op Fl v
|
||||
.Op Fl f Ar RDS-file
|
||||
.Op Fl p Ar port
|
||||
.Op Fl t Ar timeout
|
||||
.Sh DESCRIPTION
|
||||
nsrlsvr provides a daemon that services queries from clients requesting information
|
||||
about whether certain hash values are present in the NIST National Software Reference
|
||||
Laboratory Reference Data Set (NSRL RDS).
|
||||
.Sh OPTIONS
|
||||
.Bl -tag -width Ds
|
||||
.It Fl b
|
||||
show information on submitting bug reports, then exit
|
||||
.It Fl h
|
||||
show a help screen, then exit
|
||||
.It Fl o
|
||||
only support the old 1.0 server protocol
|
||||
.It Fl s
|
||||
allow clients to query the server status (default: disabled)
|
||||
.It Fl S
|
||||
run as a normal process (do not run as a daemon)
|
||||
.It Fl v
|
||||
show version information, then exit
|
||||
.It Fl f Ar RDS-file
|
||||
specify an alternate RDS file in
|
||||
.Ar RDS-file
|
||||
.It Fl p Ar port
|
||||
listen on
.Ar port
(default: 9120)
|
||||
.It Fl t Ar timeout
|
||||
shut down after
|
||||
.Ar timeout
|
||||
seconds of inactivity (default: disabled)
|
||||
.El
|
||||
.Sh NOTES
|
||||
Supporting the full NSRL RDS requires a lot of memory. Although nsrlsvr will run on
a 4 GB system, the results may be unsatisfactory. A 64-bit OS with at least 8 GB
of RAM is recommended.
|
||||
.Pp
|
||||
nsrlsvr treats the
|
||||
.Ar timeout
|
||||
value as a guideline. It will not shut down before
|
||||
.Ar timeout
|
||||
seconds of inactivity, but it may allow up to thirty seconds more.
|
||||
.Sh BUGS
|
||||
None known.
|
||||
.Sh SEE ALSO
|
||||
.Xr nsrllookup 1
|
||||
.Sh AUTHOR
|
||||
Robert J. Hansen <rjh@secret-alchemy.com>
|
||||
1757
src/Doxyfile
Normal file
File diff suppressed because it is too large
6
src/Makefile.am
Normal file
@@ -0,0 +1,6 @@
EXTRA_DIST = handler.hpp Doxyfile
bin_PROGRAMS = nsrlsvr
nodist_pkgdata_DATA = NSRLFile.txt
nsrlsvr_SOURCES = main.cc handler.cc
nsrlsvr_CPPFLAGS = -DPKGDATADIR="\"$(pkgdatadir)\"" -DPACKAGE_VERSION="\"$(PACKAGE_VERSION)\"" -DPACKAGE_URL="\"$(PACKAGE_URL)\"" -DPACKAGE_BUGREPORT="\"$(PACKAGE_BUGREPORT)\"" $(PTHREAD_CFLAGS)
nsrlsvr_LDFLAGS=$(PTHREAD_CFLAGS)
|
||||
595
src/handler.cc
Normal file
@@ -0,0 +1,595 @@
|
||||
/* $Id: handler.cc 142 2013-02-23 22:25:32Z rjh $
|
||||
*
|
||||
* Copyright (c) 2011-2012, Robert J. Hansen <rjh@secret-alchemy.com>
|
||||
* and others.
|
||||
*
|
||||
* Permission to use, copy, modify, and/or distribute this software for any
|
||||
* purpose with or without fee is hereby granted, provided that the above
|
||||
* copyright notice and this permission notice appear in all copies.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
|
||||
* WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
|
||||
* MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
|
||||
* ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
|
||||
* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
|
||||
* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
|
||||
* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
||||
*
|
||||
* Code standards:
|
||||
* This is a small enough project we don't need a formal coding standard.
|
||||
* That said, here are some helpful tips for people who want to submit
|
||||
* patches:
|
||||
*
|
||||
* - If it's not 100% ISO C++98, it won't get in.
|
||||
* - It must compile cleanly and without warnings under both GNU G++
|
||||
* and Clang++, even with "-W -Wextra -ansi -pedantic".
|
||||
* - C++ offers 'and', 'or' and 'not' keywords instead of &&, || and !.
|
||||
* I like these: I think they're more readable. Please use them.
|
||||
* - C++ allows you to initialize variables at declaration time by
|
||||
* doing something like "int x(3)" instead of "int x = 3". Please
|
||||
* do this where practical: it's a good habit to get into for C++.
|
||||
* - Please try to follow the formatting conventions. It's mostly
|
||||
* straight-up astyle format, with occasional tweaks where necessary
|
||||
* to get nice hardcopy printouts.
|
||||
* - If you write a new function it must have a Doxygen block
|
||||
* documenting it.
|
||||
*
|
||||
* Contributor history:
|
||||
*
|
||||
* Robert J. Hansen <rjh@secret-alchemy.com>
|
||||
* - most everything
|
||||
* Jesse Kornblum <jessekornblum@gmail.com>
|
||||
* - patch to log how many hashes are in each QUERY statement
|
||||
*/
|
||||
|
||||
#include <string>
|
||||
#include <set>
|
||||
#include <vector>
|
||||
#include <algorithm>
|
||||
#include <functional>
|
||||
#include <memory>
|
||||
#include <exception>
|
||||
#include "handler.hpp"
|
||||
#include <poll.h>
|
||||
#include <cstdlib> // for getloadavg
|
||||
#include <sys/types.h>
|
||||
#include <syslog.h>
|
||||
#include <inttypes.h>
|
||||
|
||||
#define INFO LOG_MAKEPRI(LOG_USER, LOG_INFO)
|
||||
|
||||
/* Additional defines necessary on Linux: */
|
||||
#ifdef __linux__
|
||||
#include <cstring> // for memset
|
||||
#include <cstdio> // for snprintf
|
||||
#include <unistd.h> // because Fedora has lately taken to being weird
|
||||
#endif
|
||||
|
||||
using std::set;
|
||||
using std::string;
|
||||
using std::find;
|
||||
using std::find_if;
|
||||
using std::transform;
|
||||
using std::vector;
|
||||
using std::not1;
|
||||
using std::equal_to;
|
||||
using std::ptr_fun;
|
||||
using std::remove;
|
||||
using std::auto_ptr;
|
||||
using std::exception;
|
||||
|
||||
extern const set<string>& hashes;
|
||||
extern const bool& enable_status;
|
||||
extern const bool& only_old;
|
||||
|
||||
namespace {
|
||||
|
||||
/** A convenience exception representing network errors that cannot
|
||||
* be recovered from, and will result in a graceful bomb-out.
|
||||
*
|
||||
* @since 1.1
|
||||
* @author Robert J. Hansen */
|
||||
class UnrecoverableNetworkError : public exception
|
||||
{
|
||||
public:
|
||||
const char* what() const throw() {
|
||||
return "unrecoverable network error";
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
/** A functor that provides stateful reading of line-oriented data
|
||||
* across UNIX file descriptors.
|
||||
*
|
||||
* The big problem with reading information over a socket
|
||||
* connection is that data can arrive in a badly fragmented form.
|
||||
* On a console you can just call getline() and be confident that
|
||||
* when it returns there will be a CR/LF at the end and no data
|
||||
* afterwards: that's the great virtue of accepting data one byte
|
||||
* at a time on a tty. On a network connection you have to take
|
||||
* what the system gives you, and if the system gives you two
|
||||
* strings spread over three packets with a CR/LF smack in the
|
||||
* middle, well ... you have to make do. That means returning the
|
||||
* first line and storing the rest of the data for use in a
|
||||
* subsequent call to the data reading facility.
|
||||
*
|
||||
* So, in other words, our get_line function needs to track state
|
||||
* *and* be threadsafe/re-entrant. Declaring a static buffer within
|
||||
* the function would let it track state, but thread safety would be
|
||||
* a problem.
|
||||
*
|
||||
* Fortunately, the C++ functor idiom solves this problem
|
||||
* beautifully.
|
||||
*
|
||||
* Further: naïve blocking I/O, although it works rather well, will
|
||||
* artificially inflate the server load. For this reason the code
|
||||
* uses slightly more complex but still quite manageable poll()-
|
||||
* based I/O with a 750ms timeout. Responsiveness isn't quite as
|
||||
* high as it could be, but it's a small price to pay for better
|
||||
* behavior server-side.
|
||||
*
|
||||
* @author Rob Hansen
|
||||
* @since 0.9*/
|
||||
|
||||
struct SocketIO
|
||||
{
|
||||
public:
|
||||
/** Initializes the object to listen on a particular file
|
||||
* descriptor.
|
||||
*
|
||||
* @param fd File descriptor to read on */
|
||||
SocketIO(int32_t fd) :
|
||||
sock_fd(fd), buffer(""), tmp_buf(65536, '\0') {}
|
||||
|
||||
/** Writes a line of text to the socket. The caller is
|
||||
* responsible for ensuring the text has a '\r\n' appended.
|
||||
*
|
||||
* @param line The line to write
|
||||
* @since 1.1 */
|
||||
void write_line(string line) const
|
||||
{
|
||||
if (-1 == write(sock_fd, line.c_str(), line.size())) {
|
||||
throw UnrecoverableNetworkError();
|
||||
}
|
||||
}
|
||||
|
||||
/** Writes a line of text to the socket. The caller is
|
||||
* responsible for ensuring the text has a '\r\n' appended.
|
||||
*
|
||||
* @param line The line to write
|
||||
* @since 1.1 */
|
||||
void write_line(const char* line) const
|
||||
{
|
||||
write_line(string(line));
|
||||
}
|
||||
|
||||
/** Reads a line from the socket. Returns an auto_ptr<string>
|
||||
* because clients might be sending arbitrarily-sized (i.e.,
|
||||
* really huge) data to us. Passing smartpointers around is
|
||||
* ridiculously faster than copying huge blocks of memory.
|
||||
*
|
||||
* Arguably this should return a shared_ptr<string>, but a lot
|
||||
* of C++ compilers have shaky support for TR1. Instead we use
|
||||
* the lowest common denominator: std::auto_ptr.
|
||||
*
|
||||
* This function replaces the old operator().
|
||||
*
|
||||
* @since 1.1
|
||||
* @return An auto_ptr<string> representing one line read from
|
||||
* the file descriptor.*/
|
||||
auto_ptr<string> read_line()
|
||||
{
|
||||
/* "But in Latin, Jehovah begins with the letter 'I'..."
|
||||
*
|
||||
* SAVE YOURSELF THE NIGHTMARE BUG HUNT. Remember that when
|
||||
* you test this code at the console, tapping return will
|
||||
* enter a \n. When you do it from a Telnet client, it enters
|
||||
* a \r\n. This one-character difference turned into a six-
|
||||
* hour bug hunt. Documented here for posterity. If you ever
|
||||
* wonder why I'm tempted to start drinking before the sun
|
||||
* rises, well, this one's a good example... */
|
||||
|
||||
while (true) {
|
||||
pollfd fds = { sock_fd, POLLIN, 0 };
|
||||
int poll_code(poll(&fds, 1, 750));
|
||||
|
||||
if (-1 == poll_code)
|
||||
throw UnrecoverableNetworkError();
|
||||
|
||||
else if (fds.revents & POLLERR ||
|
||||
fds.revents & POLLHUP)
|
||||
throw UnrecoverableNetworkError();
|
||||
|
||||
else if (fds.revents & POLLIN) {
|
||||
memset(static_cast<void*>(&tmp_buf[0]),
|
||||
0,
|
||||
tmp_buf.size());
|
||||
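ssize_t bytes_read = read(sock_fd,
static_cast<void*>(&tmp_buf[0]),
tmp_buf.size());
/* read() returns -1 on error and 0 at end-of-file; bail out
   rather than index tmp_buf with a bad length or spin forever. */
if (bytes_read <= 0)
throw UnrecoverableNetworkError();
buffer.append(&tmp_buf[0], static_cast<size_t>(bytes_read));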
|
||||
|
||||
/* To prevent DoS from clients spamming us with huge
|
||||
packets, bomb on any query larger than 256k. */
|
||||
if (buffer.size() > 262144)
|
||||
throw UnrecoverableNetworkError();
|
||||
|
||||
string::iterator iter = find(buffer.begin(),
|
||||
buffer.end(), '\n');
|
||||
if (iter != buffer.end()) {
|
||||
auto_ptr<string> rv(new string(buffer.begin(), iter));
|
||||
rv->erase(remove(rv->begin(),
|
||||
rv->end(),
|
||||
'\r'),
|
||||
rv->end());
|
||||
rv->erase(remove(rv->begin(),
|
||||
rv->end(),
|
||||
'\n'),
|
||||
rv->end());
|
||||
buffer = string(iter + 1, buffer.end());
|
||||
return rv;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
private:
|
||||
/** Tracks the file descriptor to read */
|
||||
const int32_t sock_fd;
|
||||
/** Internal storage buffer for keeping track of read, but not
|
||||
* yet finished, data */
|
||||
string buffer;
|
||||
/** Internal storage buffer used only briefly, but declared here
|
||||
* so that we can avoid repeatedly putting it on the
|
||||
* stack. Additionally, this only takes a few bytes on the stack:
|
||||
* the actual buffer gets allocated on the heap. */
|
||||
vector<char> tmp_buf;
|
||||
};
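The buffering idea described in the SocketIO comment can be sketched without any sockets at all. The class below is a hypothetical illustration, not part of nsrlsvr: `LineBuffer`, `feed` and `next_line` are invented names, and unlike `read_line()` it strips only a trailing `'\r'` rather than every `'\r'` in the line.

```cpp
#include <string>

// Hypothetical helper, not part of nsrlsvr: accumulates raw chunks and
// hands back complete lines one at a time.
class LineBuffer {
public:
    // Append whatever bytes the last read() produced.
    void feed(const std::string& chunk) { pending_ += chunk; }

    // If a full line is buffered, store it in `out` (without the trailing
    // "\n" or "\r\n"), keep the remainder for later, and return true.
    bool next_line(std::string& out)
    {
        const std::string::size_type pos = pending_.find('\n');
        if (pos == std::string::npos)
            return false;                      // line still incomplete
        out.assign(pending_, 0, pos);
        if (!out.empty() && out[out.size() - 1] == '\r')
            out.erase(out.size() - 1);         // tolerate Telnet-style CR/LF
        pending_.erase(0, pos + 1);            // keep leftover bytes
        return true;
    }

private:
    std::string pending_;  // bytes received but not yet returned
};
```

Because the leftover bytes live in the object rather than in a static buffer, each connection can own its own instance, which is the thread-safety point the comment makes.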
|
||||
|
||||
/** A hand-rolled string tokenizer in C++.
|
||||
*
|
||||
* Efficient string tokenization in 29 lines, without absurd
|
||||
* contortions of code. Booyah. Given the state of things in C,
|
||||
* where on some platforms strtok is outright obsoleted by strsep
|
||||
* and on other platforms strsep is just a distant promise of what
|
||||
* the future might hold... I'll take this way.
|
||||
*
|
||||
* Returns a smartpointer to a vector for the same reason
|
||||
* SocketIO::read_line() returns one: to spare us the
|
||||
* otherwise absurd amount of memcpying that would be going on.
|
||||
*
|
||||
* @param line The line to tokenize; it is uppercased in place as a side effect
|
||||
* @param character The delimiter character
|
||||
* @returns An auto_ptr to a vector of strings representing tokens */
|
||||
auto_ptr<vector<string> > tokenize(string& line, char character = ' ')
|
||||
{
|
||||
auto_ptr<vector<string> > rv(new vector<string>());
|
||||
transform(line.begin(), line.end(), line.begin(), ::toupper);
|
||||
|
||||
string::iterator begin(find_if(line.begin(), line.end(),
|
||||
not1(bind2nd(equal_to<char>(),
|
||||
character))));
|
||||
string::iterator end(
|
||||
(begin != line.end())
|
||||
? find(begin + 1, line.end(), character)
|
||||
: line.end()
|
||||
);
|
||||
|
||||
while (begin != line.end()) {
|
||||
rv->push_back(string(begin, end));
|
||||
if (end == line.end()) {
|
||||
begin = line.end();
|
||||
continue;
|
||||
}
|
||||
begin = find_if(end + 1, line.end(),
|
||||
not1(bind2nd(equal_to<char>(), character)));
|
||||
end = (begin != line.end())
|
||||
? find(begin + 1, line.end(), character)
|
||||
: line.end();
|
||||
}
|
||||
return rv;
|
||||
}
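As a rough illustration of what the tokenizer produces, here is a simplified sketch that returns the tokens by value. `split_upper` is a hypothetical name, and the code is an assumption about the intended behaviour (uppercase everything, treat runs of the delimiter as one separator), not a drop-in replacement for `tokenize()`.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for tokenize(): returns the uppercased tokens of
// `line`, treating any run of `delim` characters as a single separator.
std::vector<std::string> split_upper(const std::string& line, char delim = ' ')
{
    std::vector<std::string> tokens;
    std::string current;
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        if (line[i] == delim) {
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
        } else {
            // Go through unsigned char: toupper() is undefined for
            // negative values other than EOF.
            current += static_cast<char>(
                std::toupper(static_cast<unsigned char>(line[i])));
        }
    }
    if (!current.empty())
        tokens.push_back(current);
    return tokens;
}
```

Returning by value copies more than the `auto_ptr` version does on a C++98 compiler; the sketch trades that cost for brevity.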
|
||||
|
||||
/** A hand-rolled string tokenizer in C++.
|
||||
*
|
||||
* Efficient string tokenization in 29 lines, without absurd
|
||||
* contortions of code. Booyah. Given the state of things in C,
|
||||
* where on some platforms strtok is outright obsoleted by strsep
|
||||
* and on other platforms strsep is just a distant promise of what
|
||||
* the future might hold... I'll take this way.
|
||||
*
|
||||
* Returns a smartpointer to a vector for the same reason
|
||||
* SocketIO::read_line() returns one: to spare us the
|
||||
* otherwise absurd amount of memcpying that would be going on.
|
||||
*
|
||||
* @param line A smartpointer to the line to tokenize (uppercased in place)
* @param ch The delimiter character
|
||||
* @returns An auto_ptr to a vector of strings representing tokens */
|
||||
auto_ptr<vector<string> > tokenize(auto_ptr<string> line, char ch = ' ')
|
||||
{
|
||||
return tokenize(*line, ch);
|
||||
}
|
||||
|
||||
/** Turns a string of 'a.b.c.d', à la dotted-quad style, into a
|
||||
* 32-bit integer. 'a' must be present: if b through d are
|
||||
* omitted, they are assumed to be zero.
|
||||
*
|
||||
* @param line A smartpointer to a version string
|
||||
* @returns A 32-bit integer representing a version, or -1 on
|
||||
* failure.
|
||||
* @author Rob Hansen
|
||||
* @since 0.9 */
|
||||
int32_t parse_version(auto_ptr<string> line)
|
||||
{
|
||||
int32_t version(0);
|
||||
int32_t this_token(0);
|
||||
auto_ptr<vector<string> > tokens(tokenize(line));
|
||||
auto_ptr<vector<string> > version_tokens;
|
||||
size_t index(0);
|
||||
|
||||
if (tokens->size() != 2 or
|
||||
tokens->at(0) != "VERSION:") {
|
||||
goto PARSE_VERSION_BAIL_BAD;
|
||||
}
|
||||
|
||||
version_tokens = tokenize(tokens->at(1), '.');
|
||||
|
||||
if (version_tokens->size() < 1 or version_tokens->size() > 4) {
|
||||
goto PARSE_VERSION_BAIL_BAD;
|
||||
}
|
||||
|
||||
while (version_tokens->size() != 4) {
|
||||
version_tokens->push_back("0");
|
||||
}
|
||||
|
||||
for (index = 0 ; index < 4 ; ++index) {
|
||||
string& thing(version_tokens->at(index));
|
||||
if (thing.end() != find_if(thing.begin(),
|
||||
thing.end(),
|
||||
not1(ptr_fun(::isdigit)))) {
|
||||
goto PARSE_VERSION_BAIL_BAD;
|
||||
}
|
||||
this_token = atoi(thing.c_str());
|
||||
if (this_token < 0 or this_token > 254) {
|
||||
goto PARSE_VERSION_BAIL_BAD;
|
||||
}
|
||||
version = (version << 8) + this_token;
|
||||
}
|
||||
goto PARSE_VERSION_BAIL;
|
||||
|
||||
PARSE_VERSION_BAIL_BAD:
|
||||
version = -1;
|
||||
|
||||
PARSE_VERSION_BAIL:
|
||||
return version;
|
||||
}
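The packing step in `parse_version()` (shift left by eight bits, add the next component) can be tried in isolation. The sketch below is hypothetical: `pack_version` is an invented name, it takes the bare `a.b.c.d` text rather than a full `VERSION:` line, it is stricter about stray dots than the tokenizer-based original, and it uses `unsigned long` so that large first components cannot overflow a signed 32-bit value.

```cpp
#include <string>

// Hypothetical helper: packs "a.b.c.d" into 0xAABBCCDD, treating missing
// trailing components as zero. Returns false on malformed input.
bool pack_version(const std::string& text, unsigned long& packed)
{
    unsigned long result = 0;
    int components = 0;
    std::string::size_type i = 0;

    while (true) {
        // Every component needs at least one digit.
        if (i >= text.size() || text[i] < '0' || text[i] > '9')
            return false;
        unsigned long value = 0;
        while (i < text.size() && text[i] >= '0' && text[i] <= '9') {
            value = value * 10 + static_cast<unsigned long>(text[i] - '0');
            if (value > 254)                  // same upper bound as the source
                return false;
            ++i;
        }
        result = (result << 8) + value;
        ++components;
        if (i == text.size())
            break;
        if (components == 4 || text[i] != '.')
            return false;
        ++i;                                  // skip the dot
    }
    for ( ; components < 4; ++components)     // pad omitted components
        result <<= 8;
    packed = result;
    return true;
}
```

With this layout, "2.0" packs to 0x02000000, which is the constant `handle_client()` compares against.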
|
||||
|
||||
|
||||
/** A simple convenience function that allows us to ensure
|
||||
* we're getting valid hashes.
|
||||
*
|
||||
* @param digest The string being checked
|
||||
* @returns true if it could be an MD5 or SHA-1 digest, false otherwise
|
||||
* @since 0.9
|
||||
* @author Rob Hansen */
|
||||
bool ishexdigest(const string& digest)
|
||||
{
|
||||
string::const_iterator iter(digest.begin());
|
||||
|
||||
if (not (digest.size() == 40 or digest.size() == 32)) {
|
||||
return false;
|
||||
}
|
||||
for ( ; iter != digest.end() ; ++iter) {
|
||||
bool is_number = (*iter >= '0' and *iter <= '9');
|
||||
bool is_letter = (*iter >= 'A' and *iter <= 'F');
|
||||
if (not (is_number or is_letter))
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
/** Performs a transaction with a client. Adheres to protocol
|
||||
* version 1.0.
|
||||
*
|
||||
* @param sio The socket to listen and respond on
|
||||
* @param ip_addr The IP address of the remote host
|
||||
* @since 0.9 */
|
||||
void handle_protocol_10(SocketIO& sio, const char* ip_addr)
|
||||
{
|
||||
string return_seq("");
|
||||
uint32_t found(0);
|
||||
double frac(0.0);
|
||||
uint32_t total_queries(0);
|
||||
|
||||
try {
|
||||
auto_ptr<vector<string> > commands(tokenize(sio.read_line()));
|
||||
|
||||
if (commands->size() < 2 or commands->at(0) != "QUERY") {
|
||||
sio.write_line("NOT OK\r\n");
|
||||
return;
|
||||
}
|
||||
|
||||
for (size_t index = 1 ; index < commands->size() ; ++index) {
|
||||
if (not ishexdigest(commands->at(index))) {
|
||||
sio.write_line("NOT OK\r\n");
|
||||
return;
|
||||
}
|
||||
if (hashes.end() != hashes.find(commands->at(index))) {
|
||||
return_seq += "1";
|
||||
found += 1;
|
||||
} else {
|
||||
return_seq += "0";
|
||||
}
|
||||
}
|
||||
|
||||
total_queries = commands->size() -
|
||||
(commands->size() > 0 ? 1 : 0);
|
||||
|
||||
if (total_queries) {
|
||||
double numerator(100 * found);
|
||||
double denominator(total_queries);
|
||||
frac = numerator / denominator;
|
||||
}
|
||||
|
||||
syslog(INFO,
|
||||
"%s: protocol 1.0, found %u of %u hashes (%.1f%%), closed normally",
|
||||
ip_addr,
|
||||
found,
|
||||
total_queries,
|
||||
frac);
|
||||
return_seq = "OK " + return_seq + "\r\n";
|
||||
sio.write_line(return_seq);
|
||||
} catch (exception&) {
|
||||
return;
|
||||
}
|
||||
}
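The reply format used by the protocol 1.0 handler can be sketched as a pure function, which makes it easy to test without a socket. `answer_query` is a hypothetical name; the sketch assumes the tokens have already been uppercased and accepts only 32- or 40-character digests, as `ishexdigest()` does.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper mirroring the reply format above: tokens[0] must be
// "QUERY" and every later token a 32- or 40-character uppercase hex digest.
std::string answer_query(const std::vector<std::string>& tokens,
                         const std::set<std::string>& known)
{
    if (tokens.size() < 2 || tokens[0] != "QUERY")
        return "NOT OK\r\n";

    std::string flags;
    for (std::vector<std::string>::size_type i = 1; i < tokens.size(); ++i) {
        const std::string& digest = tokens[i];
        if (digest.size() != 32 && digest.size() != 40)
            return "NOT OK\r\n";
        if (digest.find_first_not_of("0123456789ABCDEF") != std::string::npos)
            return "NOT OK\r\n";
        // One character per digest, in request order: '1' if known.
        flags += (known.count(digest) != 0) ? '1' : '0';
    }
    return "OK " + flags + "\r\n";
}
```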
|
||||
|
||||
/** Performs a transaction with a client. Adheres to protocol
|
||||
* version 2.0.
|
||||
*
|
||||
* @param sio The socket to listen and respond on
* @param ip_addr The IP address of the remote host
|
||||
* @since 1.1 */
|
||||
void handle_protocol_20(SocketIO& sio, const char* ip_addr)
|
||||
{
|
||||
uint32_t total_queries(0);
|
||||
uint32_t found(0);
|
||||
double frac(0.0);
|
||||
|
||||
try {
|
||||
auto_ptr<vector<string> > commands(tokenize(sio.read_line()));
|
||||
while (commands->size() >= 1) {
|
||||
string return_seq("");
|
||||
|
||||
if ("BYE" == commands->at(0)) {
|
||||
if (total_queries) {
|
||||
double numerator(100 * found);
|
||||
double denominator(total_queries);
|
||||
frac = numerator / denominator;
|
||||
}
|
||||
syslog(INFO,
|
||||
"%s: protocol 2.0, found %u of %u hashes (%.1f%%), closed normally",
|
||||
ip_addr,
|
||||
found,
|
||||
total_queries,
|
||||
frac);
|
||||
return;
|
||||
}
|
||||
|
||||
else if ("DOWNSHIFT" == commands->at(0)) {
|
||||
syslog(INFO,
|
||||
"%s asked for a protocol downgrade to 1.0",
|
||||
ip_addr);
|
||||
sio.write_line("OK\r\n");
|
||||
handle_protocol_10(sio, ip_addr);
|
||||
return;
|
||||
}
|
||||
|
||||
else if ("UPSHIFT" == commands->at(0)) {
|
||||
syslog(INFO,
|
||||
"%s asked for a protocol upgrade (refused)",
|
||||
ip_addr);
|
||||
sio.write_line("NOT OK\r\n");
|
||||
}
|
||||
|
||||
else if ("QUERY" == commands->at(0)) {
|
||||
if (commands->size() == 1) {
|
||||
sio.write_line("NOT OK\r\n");
|
||||
return;
|
||||
} else {
|
||||
size_t index(1);
|
||||
for ( ; index < commands->size() ; ++index) {
|
||||
if (not ishexdigest(commands->at(index))) {
|
||||
sio.write_line("NOT OK\r\n");
|
||||
return;
|
||||
}
|
||||
|
||||
set<string>::const_iterator iter(hashes.begin());
|
||||
iter = hashes.find(commands->at(index));
|
||||
if (iter != hashes.end()) {
|
||||
return_seq += "1";
|
||||
found += 1;
|
||||
} else {
|
||||
return_seq += "0";
|
||||
}
|
||||
}
|
||||
return_seq = "OK " + return_seq + "\r\n";
|
||||
total_queries += commands->size() - 1;
|
||||
}
|
||||
}
|
||||
|
||||
else if ("STATUS" == commands->at(0) and enable_status) {
|
||||
double loadavg[3] = { 0.0, 0.0, 0.0 };
|
||||
char buf[1024];
|
||||
|
||||
getloadavg(loadavg, 3);
|
||||
memset(buf, 0, 1024);
|
||||
snprintf(buf,
|
||||
1024,
|
||||
"OK %u %s hashes, load %.2f %.2f %.2f\r\n",
|
||||
static_cast<uint32_t>(hashes.size()),
|
||||
(hashes.begin() == hashes.end()) ? "unknown" :
|
||||
(hashes.begin()->size() == 32 ? "MD5" :
|
||||
hashes.begin()->size() == 40 ? "SHA-1" :
|
||||
hashes.begin()->size() == 64 ? "SHA-256" :
|
||||
"unknown algorithm"),
|
||||
loadavg[0],
|
||||
loadavg[1],
|
||||
loadavg[2]);
|
||||
return_seq = string(buf);
|
||||
syslog(INFO,
|
||||
"%s asked for server status (sent '%s')",
|
||||
ip_addr,
|
||||
buf);
|
||||
} else if ("STATUS" == commands->at(0)) {
|
||||
syslog(INFO,
|
||||
"%s asked for server status (refused)",
|
||||
ip_addr);
|
||||
return_seq = "OK NOT SUPPORTED\r\n";
|
||||
} else {
|
||||
sio.write_line("NOT OK\r\n");
|
||||
return;
|
||||
}
|
||||
sio.write_line(return_seq);
|
||||
commands = tokenize(sio.read_line());
|
||||
}
|
||||
} catch (exception&) {
|
||||
if (total_queries) {
|
||||
double numerator(100 * found);
|
||||
double denominator(total_queries);
|
||||
frac = numerator / denominator;
|
||||
}
|
||||
syslog(INFO,
|
||||
"%s: protocol 2.0, found %u of %u hashes (%.1f%%), closed abnormally",
|
||||
ip_addr,
|
||||
found,
|
||||
total_queries,
|
||||
frac);
|
||||
}
|
||||
}
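The STATUS reply infers the hash algorithm from the length of the first stored digest. That mapping can be written out on its own; the sketch below is illustrative only, and `algorithm_for_length` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: guesses the algorithm from a hex digest's length,
// using the same three lengths the STATUS branch checks.
std::string algorithm_for_length(std::size_t hex_length)
{
    switch (hex_length) {
    case 32: return "MD5";       // 128 bits
    case 40: return "SHA-1";     // 160 bits
    case 64: return "SHA-256";   // 256 bits
    default: return "unknown algorithm";
    }
}
```

Note that the length is only a hint: other algorithms produce digests of the same sizes.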
|
||||
}
|
||||
|
||||
/** Handles client query requests.
|
||||
*
|
||||
* @param fd the client's socket file descriptor
* @param ip_addr the IP address of the remote host
|
||||
* @since 0.9 */
|
||||
void handle_client(const int32_t fd, const string& ip_addr)
|
||||
{
|
||||
SocketIO sio(fd);
|
||||
|
||||
try {
|
||||
int32_t version(parse_version(sio.read_line()));
|
||||
if (version > 0 and version <= 0x01000000) {
|
||||
sio.write_line("OK\r\n");
|
||||
handle_protocol_10(sio, ip_addr.c_str());
|
||||
} else if (version > 0x01000000 and
|
||||
version <= 0x02000000 and
|
||||
not only_old) {
|
||||
sio.write_line("OK\r\n");
|
||||
handle_protocol_20(sio, ip_addr.c_str());
|
||||
} else {
|
||||
sio.write_line("NOT OK\r\n");
|
||||
}
|
||||
} catch (exception&) {
|
||||
return;
|
||||
}
|
||||
}
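The version ranges in `handle_client()` are easier to see when pulled out into a pure function. This is a hypothetical sketch (`select_protocol` is an invented name) that assumes the packed layout produced by `parse_version()`, where 1.0 is 0x01000000 and 2.0 is 0x02000000.

```cpp
// Hypothetical helper: returns 1 or 2 for the protocol handler to use,
// or 0 if the requested version should be refused.
int select_protocol(long version, bool only_old)
{
    if (version > 0 && version <= 0x01000000L)
        return 1;                        // anything up to and including 1.0
    if (version > 0x01000000L && version <= 0x02000000L && !only_old)
        return 2;                        // above 1.0, up to and including 2.0
    return 0;                            // parse failure, too new, or 2.0 disabled
}
```

One consequence worth noticing: a client announcing 1.1 falls into the second range and is served by the 2.0 handler.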
|
||||
19
src/handler.hpp
Normal file
@@ -0,0 +1,19 @@
|
||||
/* $Id: handler.hpp 108 2012-01-30 19:30:29Z rjh $
|
||||
*
|
||||
* Copyright (c) 2011, Robert J. Hansen <rjh@secret-alchemy.com>
|
||||
*
|
||||
* Permission to use, copy, modify, and/or distribute this software for any
|
||||
* purpose with or without fee is hereby granted, provided that the above
|
||||
* copyright notice and this permission notice appear in all copies.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
|
||||
* WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
|
||||
* MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
|
||||
* ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
|
||||
* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
|
||||
* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
|
||||
* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.*/
|
||||
|
||||
#include <string>
#include <stdint.h> // for int32_t
|
||||
|
||||
void handle_client(int32_t, const std::string&);
|
||||
498
src/main.cc
Normal file
@@ -0,0 +1,498 @@
|
||||
/* $Id: main.cc 142 2013-02-23 22:25:32Z rjh $
|
||||
*
|
||||
* Copyright (c) 2011-2012, Robert J. Hansen <rjh@secret-alchemy.com>
|
||||
*
|
||||
* Permission to use, copy, modify, and/or distribute this software for any
|
||||
* purpose with or without fee is hereby granted, provided that the above
|
||||
* copyright notice and this permission notice appear in all copies.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
|
||||
* WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
|
||||
* MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
|
||||
* ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
|
||||
* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
|
||||
* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
|
||||
* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
||||
*
|
||||
* Code standards:
|
||||
* This is a small enough project we don't need a formal coding standard.
|
||||
* That said, here are some helpful tips for people who want to submit
|
||||
* patches:
|
||||
*
|
||||
* - If it's not 100% ISO C++98, it won't get in.
|
||||
* - It must compile cleanly and without warnings under both GNU G++
|
||||
* and Clang++, even with "-W -Wextra -ansi -pedantic".
|
||||
* - C++ offers 'and', 'or' and 'not' keywords instead of &&, || and !.
|
||||
* I like these: I think they're more readable. Please use them.
|
||||
* - C++ allows you to initialize variables at declaration time by
|
||||
* doing something like "int x(3)" instead of "int x = 3". Please
|
||||
* do this where practical: it's a good habit to get into for C++.
|
||||
* - Please try to follow the formatting conventions. It's mostly
|
||||
* straight-up astyle format, with occasional tweaks where necessary
|
||||
* to get nice hardcopy printouts.
|
||||
* - If you write a new function it must have a Doxygen block
|
||||
* documenting it.
|
||||
*
|
||||
* Contributor history:
|
||||
* Robert J. Hansen <rjh@secret-alchemy.com>
|
||||
* - everything
|
||||
*/
|
||||
|
||||
#include <sys/stat.h>
|
||||
#include <syslog.h>
|
||||
#include <set>
|
||||
#include <string>
|
||||
#include <time.h>
|
||||
#include <arpa/inet.h>
|
||||
#include <pthread.h>
|
||||
#include <algorithm>
|
||||
#include <limits.h>
|
||||
#include "handler.hpp"
|
||||
#include <iostream>
|
||||
#include <fstream>
|
||||
#include <vector>
|
||||
#include <memory>
|
||||
|
||||
/* Additional defines necessary on Linux: */
|
||||
#ifdef __linux__
|
||||
#include <cstring> // for memset
|
||||
#include <cstdio> // for stderr
|
||||
#include <unistd.h> // for close, fork, chdir (Fedora only)
|
||||
#endif
|
||||
|
||||
/* Additional defines necessary on FreeBSD: */
|
||||
/* Necessary for sockaddr and sockaddr_in structures */
|
||||
#ifdef __FreeBSD__
|
||||
#include <sys/socket.h>
|
||||
#include <netinet/in.h>
|
||||
#endif
|
||||
|
||||
using std::string;
|
||||
using std::set;
|
||||
using std::transform;
|
||||
using std::find_if;
|
||||
using std::not1;
|
||||
using std::ptr_fun;
|
||||
using std::ifstream;
|
||||
using std::cerr;
|
||||
using std::vector;
|
||||
using std::remove_if;
|
||||
|
||||
#define INFO LOG_MAKEPRI(LOG_USER, LOG_INFO)
|
||||
#define WARN LOG_MAKEPRI(LOG_USER, LOG_WARNING)
|
||||
#define DEBUG LOG_MAKEPRI(LOG_USER, LOG_DEBUG)
|
||||
#define MAX_PENDING_REQUESTS 20
|
||||
#define BUFFER_SIZE 8192
|
||||
|
||||
namespace {
|
||||
|
||||
/** Tracks whether the server should only support protocol 1.0. */
|
||||
bool old_only(false);
|
||||
|
||||
/** Tracks whether the server should support status queries. */
|
||||
bool status_enabled(false);
|
||||
|
||||
/** Tracks whether the server should run as a normal (standalone) process instead of as a daemon. */
|
||||
bool standalone(false);
|
||||
|
||||
/** Our set of hashes, represented as a set of strings. Note
|
||||
* that the current NSRL library contains approximately 32
|
||||
* million values, each at roughly 64 bytes (rounded to binary
|
||||
* powers to make the math easier). This is 2**25 values times
|
||||
* 2**6 bytes each = 2**31 bytes, or about two gigs of RAM.
|
||||
*
|
||||
* Moral of the story: populating this set is computationally
|
||||
* expensive. */
|
||||
set<string> hash_set;
|
||||
|
||||
/** Tracks where we look for the location of the
|
||||
* reference data set. */
|
||||
string RDS_LOC(PKGDATADIR "/NSRLFile.txt");
|
||||
|
||||
/** Keeps track of the last time we serviced a request.
|
||||
* This is guarded by the mutex declared below. */
|
||||
time_t last_req_at(time(0));
|
||||
|
||||
/** Keeps track of how many clients are currently being serviced.
|
||||
* This is guarded by the mutex declared below. */
|
||||
int32_t active_sessions(0);
|
||||
|
||||
/** A mutex to keep various threads from clobbering each other
|
||||
* in their fanatical zeal to update shared resources.
|
||||
*
|
||||
* Interestingly, PTHREAD_MUTEX_INITIALIZER is so complex that
|
||||
* it cannot be used in a C++ initializer: you have to use old
|
||||
* C-style equals-operator initialization. */
|
||||
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
|
||||
|
||||
/** The server's inactivity timeout interval */
|
||||
int32_t TIMEOUT(INT_MAX);
|
||||
|
||||
/** Which port to listen on */
|
||||
uint16_t PORT(9120);
|
||||
|
||||
/** A convenience class allowing us to pass multiple pieces of
|
||||
data with a void*. */
|
||||
struct clientinfo {
|
||||
clientinfo(int32_t sfd, const char* ipaddr) :
|
||||
sock_fd(sfd), ip_address(ipaddr) {}
|
||||
int32_t sock_fd;
|
||||
string ip_address;
|
||||
};
|
||||
|
||||
|
||||
/** Determines whether a character represents a valid uppercase
|
||||
* hexadecimal digit. */
|
||||
|
||||
bool is_hexit(char ch)
|
||||
{
|
||||
return (ch >= '0' and ch <= '9') or (ch >= 'A' and ch <= 'F');
|
||||
}
|
||||
|
||||
/** Loads hashes from disk and stores them in a fast-accessing
|
||||
* in-memory data structure. This will be slow. */
|
||||
void load_hashes()
|
||||
{
|
||||
vector<char> buf(BUFFER_SIZE);
|
||||
ifstream infile(RDS_LOC.c_str());
|
||||
|
||||
if (not infile.good()) {
|
||||
syslog(WARN, "couldn't open hashes file %s",
|
||||
RDS_LOC.c_str());
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
while (infile) {
|
||||
// Per the C++ spec, &vector<T>[loc] is guaranteed
|
||||
// to be a T*. (Unless it's a vector<bool>, in which
|
||||
// case you're living in such sin there's absolutely
|
||||
// no help for you. Friends don't let friends use
|
||||
// vector<bool>.)
|
||||
memset(static_cast<void*>(&buf[0]), 0, BUFFER_SIZE);
|
||||
infile.getline(&buf[0], BUFFER_SIZE);
|
||||
string line(&buf[0]); // stop at the terminating NUL instead of copying all of buf
|
||||
string::iterator iter(line.begin());
|
||||
string token("");
|
||||
|
||||
while (iter != line.end()) {
|
||||
string::iterator end(find(iter, line.end(), ','));
|
||||
token = string(iter, end);
|
||||
transform(token.begin(), token.end(), token.begin(), ::toupper);
|
||||
token.erase(remove_if(token.begin(),
|
||||
token.end(),
|
||||
not1(ptr_fun(is_hexit))),
|
||||
token.end());
|
||||
if (32 == token.size() or 40 == token.size() or 64 == token.size()) {
|
||||
break;
|
||||
}
|
||||
iter = (end == line.end() ? line.end() : end + 1);
|
||||
}
|
||||
|
||||
if (32 != token.size() and 40 != token.size() and 64 != token.size()) {
|
||||
continue;
|
||||
}
|
||||
if (hash_set.size() > 0 and hash_set.size() % 1000000 == 0) {
|
||||
syslog(INFO, "%lu million hashes read",
static_cast<unsigned long>(hash_set.size() / 1000000));
|
||||
}
|
||||
hash_set.insert(token);
|
||||
}
|
||||
infile.close();
|
||||
syslog(INFO, "read in %u unique hashes",
|
||||
static_cast<uint32_t>(hash_set.size()));
|
||||
}
|
||||
|
||||
/** A thin wrapper around handler.cc and handle_client, meant
 * to ensure the programmer of that function doesn't have to
 * worry about thread contention. */
void* run_client_thread(void* arg)
{
    clientinfo* ci(static_cast<clientinfo*>(arg));
    const int32_t sock_fd(ci->sock_fd);
    const string ip_address(ci->ip_address);

    // Delete the dynamically-allocated memory block. This
    // is an inevitable line of execution after successfully
    // allocating the block in the main loop (below).
    delete ci;

    if (0 != pthread_mutex_lock(&mutex)) {
        syslog(WARN, "couldn't acquire the mutex!");
        close(sock_fd);
        exit(EXIT_FAILURE);
    }
    last_req_at = time(0);
    active_sessions += 1;
    if (0 != pthread_mutex_unlock(&mutex)) {
        syslog(WARN, "couldn't release the mutex!");
        close(sock_fd);
        exit(EXIT_FAILURE);
    }

    syslog(INFO, "connection from %s", ip_address.c_str());

    handle_client(sock_fd, ip_address);

    close(sock_fd);

    syslog(INFO, "disconnected from %s", ip_address.c_str());

    if (0 != pthread_mutex_lock(&mutex)) {
        syslog(WARN, "couldn't acquire the mutex!");
        exit(EXIT_FAILURE);
    }
    active_sessions -= 1;
    if (0 != pthread_mutex_unlock(&mutex)) {
        syslog(WARN, "couldn't release the mutex!");
        exit(EXIT_FAILURE);
    }
    return NULL;
}


/** Converts our application into a proper daemon. */
void daemonize()
{
    const pid_t pid(fork());
    if (pid < 0) {
        syslog(WARN, "couldn't fork!");
        exit(EXIT_FAILURE);
    } else if (pid > 0) {
        exit(EXIT_SUCCESS);
    }

    syslog(INFO, "daemon started");
    umask(0);

    if (setsid() < 0) {
        syslog(WARN, "couldn't set sid");
        exit(EXIT_FAILURE);
    }
    // Technically, the root directory is the only one guaranteed
    // to exist on the filesystem. Therefore, it's the only safe
    // directory to point our daemon at. I doubt this is strictly
    // necessary, but remembering to completely rebase a daemon is
    // part of just good hacking etiquette.
    if (0 > chdir("/")) {
        syslog(WARN, "couldn't chdir to root");
        exit(EXIT_FAILURE);
    }
    // No extraneous filehandles for us. Daemons lack stdio, so
    // shut 'em on down.
    close(STDIN_FILENO);
    close(STDOUT_FILENO);
    close(STDERR_FILENO);
}


/** Creates a server socket that will listen for clients. */
int32_t make_socket()
{
    int32_t sock;
    sockaddr_in server;

    memset(static_cast<void*>(&server), 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_ANY);
    server.sin_port = htons(PORT);

    if (0 > (sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP))) {
        syslog(WARN, "couldn't create a server socket");
        exit(EXIT_FAILURE);
    }
    if (0 > bind(sock, reinterpret_cast<sockaddr*>(&server),
                 sizeof(server))) {
        // Report the configured port, which -p may have changed
        // from the default of 9120.
        syslog(WARN, "couldn't bind to port %u",
               static_cast<unsigned>(PORT));
        exit(EXIT_FAILURE);
    }
    if (0 > listen(sock, MAX_PENDING_REQUESTS)) {
        syslog(WARN, "couldn't listen for clients");
        exit(EXIT_FAILURE);
    }
    syslog(INFO, "ready for clients");
    return sock;
}

/** A thread that runs every thirty seconds checking to see if the
 * daemon should politely exit. It will automatically shut down
 * if no clients are currently being serviced and more than
 * TIMEOUT seconds have elapsed since the time the last client
 * connected. */
void* shutdown_handler(void*)
{
    while (1) {
        if (0 != pthread_mutex_lock(&mutex)) {
            syslog(WARN, "shutdown handler couldn't get mutex");
            exit(EXIT_FAILURE);
        }
        if (0 == active_sessions &&
            (TIMEOUT < (time(0) - last_req_at))) {
            syslog(INFO, "exiting normally due to inactivity");
            exit(EXIT_SUCCESS);
        }
        if (0 != pthread_mutex_unlock(&mutex)) {
            syslog(WARN, "shutdown handler couldn't release mutex");
            exit(EXIT_FAILURE);
        }
        sleep(30);
    }
    return NULL;
}

/** Checks a string to see if it's a valid base-10 number. Returns
 * the parsed value, or -1 if the string is empty or contains a
 * non-digit character. */
int32_t is_num(const string& num)
{
    string::const_iterator b(num.begin());
    string::const_iterator e(num.end());

    return (b != e and e == find_if(b, e, not1(ptr_fun(::isdigit))))
        ? ::atoi(num.c_str())
        : -1;
}

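/** Checks a string to see whether it's a port in the range
 * [1024, 65535] (i.e., in userspace). PORT is only updated when
 * the value is valid. */
bool validate_port(const string& foo)
{
    // Range-check before narrowing. Masking with 0xFFFF would turn
    // is_num()'s -1 error value into 65535 and wrap values above
    // 65535 into the valid range.
    const int32_t port(is_num(foo));
    if (port < 1024 or port > 65535) {
        return false;
    }
    PORT = static_cast<uint16_t>(port);
    return true;
}
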
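/** Checks a string to see whether it's a valid timeout in seconds.
 * A value of 0 means "effectively never" (INT_MAX seconds). */
bool validate_timeout(const string& foo)
{
    int32_t timeout(is_num(foo));
    if (0 == timeout) {
        timeout = INT_MAX;
    }
    // Store the timeout in both cases, so that 0 really does set
    // TIMEOUT to INT_MAX rather than leaving it at its prior value.
    if (0 < timeout) {
        TIMEOUT = timeout;
    }
    return (0 < timeout);
}
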
void show_usage(const char* program_name)
{
    cerr <<
        "Usage: " << program_name << " [-vbhsSo -f FILE -p PORT -t TIMEOUT]\n\n" <<
        "-v : print version information\n" <<
        "-b : get information on reporting bugs\n" <<
        "-f : specify an alternate RDS (default: "<< PKGDATADIR <<
        "/NSRLFile.txt)\n" <<
        "-s : allow clients to query server status (default: disabled)\n" <<
        "-S : run as a normal process (do not run as a daemon)\n" <<
        "-o : only support old (1.0) nsrlsvr protocol\n" <<
        "-h : show this help message\n" <<
        "-p : listen on PORT, between 1024 and 65535 (default: 9120)\n" <<
        "-t : stop after TIMEOUT seconds of inactivity (default: disabled)\n\n";
    exit(EXIT_FAILURE);
}
}

/** An externally available const reference to the hash set. */
const set<string>& hashes(hash_set);

/** An externally available const reference to the variable storing
 * whether or not status checking should be enabled. */
const bool& enable_status(status_enabled);

/** An externally available const reference to the variable storing
 * whether or not only protocol 1.0 should be supported. */
const bool& only_old(old_only);

/** magic happens here */
int main(int argc, char* argv[])
{
    int32_t svr_sock(0);
    int32_t client_sock(0);
    sockaddr_in client;
    // accept() takes a socklen_t*, which is not guaranteed to be
    // the same type as uint32_t.
    socklen_t client_length(0);
    pthread_t shutdown_handler_id;
    string port_num("9120");
    string timeout("0");
    std::auto_ptr<ifstream> infile;
    int32_t opt(0);

    while (-1 != (opt = getopt(argc, argv, "bsvof:hp:t:S"))) {
        switch (opt) {
        case 'v':
            cerr << argv[0] << " " << PACKAGE_VERSION << "\n\n";
            exit(0);
            break;
        case 'b':
            cerr << argv[0] << " " << PACKAGE_VERSION
                 << "\n" << PACKAGE_URL << "\n" <<
                "Praise, blame and bug reports to " << PACKAGE_BUGREPORT << ".\n\n" <<
                "Please be sure to include your operating system, version of your\n" <<
                "operating system, and a detailed description of how to recreate\n" <<
                "your bug.\n\n";
            exit(0);
            break;
        case 'f':
            RDS_LOC = string((const char*) optarg);
            infile = std::auto_ptr<ifstream>(new ifstream(RDS_LOC.c_str()));
            if (not infile->good()) {
                cerr <<
                    "Error: the specified dataset file could not be found.\n\n";
                exit(EXIT_FAILURE);
            }
            // No explicit close: the auto_ptr will take care of that
            // on object destruction.
            break;
        case 'h':
            show_usage(argv[0]);
            break;
        case 'p':
            port_num = string(optarg);
            break;
        case 't':
            timeout = string(optarg);
            break;
        case 's':
            status_enabled = true;
            break;
        case 'S':
            standalone = true;
            break;
        case 'o':
            old_only = true;
            break;
        default:
            show_usage(argv[0]);
            exit(EXIT_FAILURE);
        }
    }

    if (not (validate_port(port_num) and validate_timeout(timeout))) {
        show_usage(argv[0]);
        exit(EXIT_FAILURE);
    }

    if (not standalone)
        daemonize();

    load_hashes();
    svr_sock = make_socket();

    pthread_create(&shutdown_handler_id, NULL, shutdown_handler, NULL);

    while (true) {
        client_length = sizeof(client);
        if (0 > (client_sock = accept(svr_sock,
                                      reinterpret_cast<sockaddr*>(&client),
                                      &client_length))) {
            syslog(WARN, "dropped a connection");
        } else {
            try {
                pthread_t thread_id;
                const char* ipaddr(inet_ntoa(client.sin_addr));
                clientinfo* data(new clientinfo(client_sock, ipaddr));
                if (0 != pthread_create(&thread_id, NULL, run_client_thread, data)) {
                    // The thread never started, so nothing else will
                    // free the clientinfo or close the socket.
                    syslog(WARN, "couldn't create a client thread");
                    delete data;
                    close(client_sock);
                } else {
                    // Nobody joins client threads; detach so their
                    // resources are reclaimed when they finish.
                    pthread_detach(thread_id);
                }
            } catch (std::bad_alloc&) {
                // There's no reason to have the server fall over:
                // the sysadmin might be able to kill off whatever
                // errant process is taking up all the RAM.
                syslog(WARN, "Critically short of available RAM!");
                // No thread owns this socket, so close it here.
                close(client_sock);
                continue;
            }
        }
    }
}