NetOsPgm: socket more detail ( tags: protocol field,

A socket is created with the socket() call, which has the following prototype:

int socket(int domain, int type, int protocol);

Domain: It specifies the communication domain. It takes one of the predefined values described under the protocol family and address family above in this lecture.
Type: It specifies the semantics of communication, or the type of service that is desired. It takes the following values:
- SOCK_STREAM : Stream Socket
- SOCK_DGRAM : Datagram Socket
- SOCK_RAW : Raw Socket
- SOCK_SEQPACKET : Sequenced Packet Socket
- SOCK_RDM : Reliably Delivered Message Packet
Protocol: This parameter identifies the protocol the socket is supposed to use. Some values are as follows:
- IPPROTO_TCP : For TCP (SOCK_STREAM)
- IPPROTO_UDP : For UDP (SOCK_DGRAM)
Since the Internet domain normally offers only one protocol for each of these socket types, it does not matter if we do not name a protocol at all. So for simplicity, we can put "0" (zero) in the protocol field and let the system pick the default.

stream socket/datagram socket vs raw socket:

* A raw socket is used to receive raw packets. This means packets received at the Ethernet layer will directly pass to the raw socket. Stating it precisely, a raw socket bypasses the normal TCP/IP processing and sends the packets to the specific user application (see Figure 1)

* Other sockets, such as stream sockets and datagram sockets, receive data from the transport layer that contains no headers, only the payload. This means that the received data itself carries no information about the source IP address or MAC address. If applications running on the same machine or on different machines are communicating, then they are only exchanging data.

* The purpose of a raw socket is absolutely different. A raw socket allows an application to directly access lower level protocols, which means a raw socket receives un-extracted packets (see Figure 2). There is no need to provide the port and IP address to a raw socket, unlike in the case of stream and datagram sockets.

Stream Sockets − Delivery in a networked environment is guaranteed. If you send through the stream socket three items "A, B, C", they will arrive in the same order − "A, B, C". These sockets use TCP (Transmission Control Protocol) for data transmission. If delivery is impossible, the sender receives an error indicator. Data records do not have any boundaries.
Datagram Sockets − Delivery in a networked environment is not guaranteed. They're connectionless because you don't need to have an open connection as in Stream Sockets − you build a packet with the destination information and send it out. They use UDP (User Datagram Protocol).

source: http://opensourceforu.com/2015/03/a-guide-to-using-raw-sockets/

http://www.tutorialspoint.com/unix_sockets/what_is_socket.htm

Example:

 s = socket(AF_INET, SOCK_STREAM, 0);

This call would result in a stream socket being created with the TCP protocol providing the underlying communication support.

If the protocol argument to the socket() call is 0, socket() will select a default protocol to use with the returned socket of the type requested. The default protocol is usually correct; alternate choices aren't usually available.
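As a minimal sketch of the point above, the helper below creates a socket and asks the kernel which type it ended up with, so a protocol of 0 can be compared with an explicit IPPROTO_TCP or IPPROTO_UDP. The name socket_type_of() is hypothetical and not part of the sockets API.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a socket and return the type the kernel reports for it
 * (SOCK_STREAM, SOCK_DGRAM, ...), or -1 on failure.
 * socket_type_of() is a hypothetical helper name. */
int socket_type_of(int domain, int type, int protocol)
{
    int s = socket(domain, type, protocol);
    if (s < 0)
        return -1;

    int reported = 0;
    socklen_t len = sizeof(reported);
    int rc = getsockopt(s, SOL_SOCKET, SO_TYPE, &reported, &len);

    close(s);
    return rc == 0 ? reported : -1;
}
```

Calling socket_type_of(AF_INET, SOCK_STREAM, 0) and socket_type_of(AF_INET, SOCK_STREAM, IPPROTO_TCP) should both report SOCK_STREAM, which is why 0 is usually enough.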

However, when you're using "raw" sockets to communicate directly with lower-level protocols or hardware interfaces, the protocol argument may be important for setting up demultiplexing.

For example, raw sockets in the Internet family may be used to implement a new protocol above IP. The socket will receive packets only for the specified protocol. To obtain a particular protocol, you determine the protocol number defined within the communication domain, using the getprotobyname() function, for example:

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
...
pp = getprotobyname("newtcp");
if (pp != NULL)    /* NULL means the name is not in the protocols database */
    s = socket(AF_INET, SOCK_STREAM, pp->p_proto);

This would result in a socket s that uses a stream-based connection, but with protocol type of newtcp instead of the default tcp.
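Since "newtcp" is only an illustrative name, here is a small sketch of the same lookup that can be tried with names that normally do exist, such as "tcp" and "udp". The helper name protocol_number() and its fallback parameter are assumptions made for this example; getprotobyname() returns NULL for an unknown name, so the result must be checked before it is dereferenced.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>

/* Look up a protocol number by name in the system protocols database
 * (typically /etc/protocols). If the name is unknown, or the database is
 * not available, return the caller-supplied fallback instead.
 * protocol_number() is a hypothetical helper name. */
int protocol_number(const char *name, int fallback)
{
    struct protoent *pp = getprotobyname(name);

    return pp != NULL ? pp->p_proto : fallback;
}
```

The result can be passed straight to socket(), for example socket(AF_INET, SOCK_STREAM, protocol_number("tcp", IPPROTO_TCP)).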

Data transfer (in TCP)

With a connection established, data may begin to flow. To send and receive data, you can choose from several calls.
If the peer entity at each end of a connection is anchored, you can send or receive a message without specifying the peer. In this case, you can use the normal read() and write() functions:

write(s, buf, sizeof (buf));
read(s, buf, sizeof (buf));

In addition to read() and write(), you can use the socket-specific recv() and send() calls:

send(s, buf, sizeof (buf), flags);
recv(s, buf, sizeof (buf), flags);

Although recv() and send() are virtually identical to read() and write(), the extra flags argument is important (the flag values are defined in <sys/socket.h>). One or more of the following flags may be specified:

MSG_OOB: Send/receive out-of-band data.
MSG_PEEK: Look at data without reading.
MSG_DONTROUTE: Send data without routing packets
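To make the MSG_PEEK flag concrete, the sketch below sends a short message over a connected pair of stream sockets, looks at it with MSG_PEEK, and then reads it normally. It uses socketpair() with AF_UNIX purely so that it runs without a network peer; that choice and the helper name peek_then_read() are assumptions of this example, and the same send()/recv() calls apply to a connected TCP socket.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg over a connected stream socket pair, peek at it with MSG_PEEK,
 * then read it for real. Returns 0 when both recv() calls saw the same
 * bytes, -1 on any error. peek_then_read() is a hypothetical helper. */
int peek_then_read(const char *msg)
{
    int sv[2];
    char peeked[64] = {0};
    char taken[64] = {0};
    size_t len = strlen(msg);
    int result = -1;

    if (len == 0 || len >= sizeof(peeked))
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (send(sv[0], msg, len, 0) == (ssize_t)len) {
        /* MSG_PEEK copies the data out but leaves it queued on the socket. */
        ssize_t n1 = recv(sv[1], peeked, len, MSG_PEEK);
        /* A plain recv() therefore returns the same bytes again. */
        ssize_t n2 = recv(sv[1], taken, len, 0);

        if (n1 == (ssize_t)len && n2 == (ssize_t)len &&
            memcmp(peeked, taken, len) == 0 &&
            memcmp(taken, msg, len) == 0)
            result = 0;
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```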

Out-of-band data is a notion specific to stream sockets; we won't immediately consider it here. The option to have data sent without routing applied to the outgoing packets is currently used only by the routing-table management process and is unlikely to be of interest to the casual user.

On the other hand, the ability to preview data can be quite useful. When MSG_PEEK is specified with a recv() call, any data present is returned, but treated as still unread. That is, the next read() or recv() call applied to the socket will return the data previously viewed.

Data transfer (in UDP)

To send data, you use the sendto() function:

sendto(s, buf, buflen, flags, (struct sockaddr *)&to, tolen);

The s, buf, buflen, and flags parameters are used as before. The to and tolen values indicate the address of the intended recipient of the message.

To receive messages on an unconnected datagram socket, you use the recvfrom() function:

recvfrom(s, buf, buflen, flags,
         (struct sockaddr *)&from, &fromlen);

Here, fromlen is a value-result parameter, initially containing the size of the from buffer, and modified on return to indicate the actual size of the address that the datagram was received from.
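The sendto()/recvfrom() pair can be tried on a single machine. The following is a rough sketch, assuming the loopback interface is available: it binds a receiving datagram socket to 127.0.0.1 on a kernel-chosen port, sends one datagram to it, and reads it back. The helper name udp_loopback_once() and the two-second receive timeout are choices made for this example only.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Send one datagram to a receiver bound on the loopback interface and read
 * it back with recvfrom(). Returns the number of bytes received, or -1 on
 * error. udp_loopback_once() is a hypothetical helper name. */
ssize_t udp_loopback_once(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in to, from;
    socklen_t tolen = sizeof(to), fromlen = sizeof(from);
    struct timeval tv = { 2, 0 };          /* do not wait forever */
    ssize_t n = -1;

    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0)
        goto done;

    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    to.sin_port = htons(0);                /* let the kernel choose a port */
    if (bind(rx, (struct sockaddr *)&to, sizeof(to)) < 0)
        goto done;

    /* Learn which port was assigned so the sender can address it. */
    if (getsockname(rx, (struct sockaddr *)&to, &tolen) < 0)
        goto done;
    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* No connection is set up: the destination travels with each datagram. */
    if (sendto(tx, msg, strlen(msg), 0, (struct sockaddr *)&to, tolen) < 0)
        goto done;

    /* fromlen holds the size of `from` on entry and the actual address
     * size on return. */
    n = recvfrom(rx, out, outlen, 0, (struct sockaddr *)&from, &fromlen);

done:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```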

Purpose of setsockopt():

Multiple binds to same local port

With certain applications, the algorithm used by the Socket Manager to select port numbers may be unsuitable. For example, the Internet file transfer protocol, FTP, specifies that data connections must always originate from the same local port (i.e. local from the server's point of view).

A server (e.g. ftpd) avoids duplicate associations because the initiating programs (e.g. ftp) use different remote ports (i.e. remote from the server's point of view), even though the server is accessed from the same local port (i.e. local from the server's point of view). In this situation, the Socket Manager would typically disallow the server's binding the same local address and port number if a previous data connection's socket still existed on that port. (This would be a bad thing for servers such as ftpd, which always want to listen to the same well-known local port).

To override the default port selection algorithm, an option call must be performed prior to address binding:

...

int on = 1;
...
setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
bind(s, (struct sockaddr *) &sin, sizeof (sin));

With the above call, local addresses already in use may be bound. This doesn't violate the uniqueness requirement, because the system still checks at connect time to be sure any other sockets with the same local address and port don't have the same remote address and port. If the association already exists, the error EADDRINUSE is returned.
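Put together as a complete function, the fragment above might look like the sketch below. The helper name bind_reusable() is hypothetical, and binding to INADDR_ANY is an assumption of this example; the essential point is only that setsockopt() runs before bind().

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a TCP socket, enable SO_REUSEADDR, and bind it to the given port
 * on all local interfaces. Returns the descriptor, or -1 on failure.
 * bind_reusable() is a hypothetical helper name. */
int bind_reusable(unsigned short port)
{
    int on = 1;
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;

    /* The option must be set before bind() to influence address checks. */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
        goto fail;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        goto fail;

    return s;

fail:
    close(s);
    return -1;
}
```

Passing port 0 lets the kernel pick a free port, which is handy for trying the function out; a real server would pass its well-known port.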

Broadcasting and determining network configuration

By using a datagram socket, you can send broadcast packets on many networks supported by the system. The network itself must support broadcasting - the system provides no broadcast simulation in software.
Broadcast messages can place a high load on a network since they force every host on the network to service them. Consequently, the ability to send broadcast packets has been limited to sockets explicitly marked as allowing broadcasting. Broadcasting is typically used for one of two reasons:

- to find a resource on a local network without prior knowledge of its address
- for functions such as routing that require information to be sent to all accessible neighbors

To send a broadcast message, a datagram socket should be created:

s = socket(AF_INET, SOCK_DGRAM, 0);

The socket is marked as allowing broadcasting:

int on = 1;

setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof (on));

and at least a port number should be bound to the socket:

sin.sin_family = AF_INET;
sin.sin_addr.s_addr = htonl(INADDR_ANY);
sin.sin_port = htons(MYPORT);
bind(s, (struct sockaddr *) &sin, sizeof (sin));
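The three steps above (create, mark, bind) can be collected into one function, sketched below. The helper names make_broadcast_socket() and send_broadcast() are hypothetical. Sending to INADDR_BROADCAST (255.255.255.255) is one possible destination and an assumption of this sketch; whether such a datagram actually leaves the host depends on the local network configuration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a datagram socket, mark it as allowed to broadcast, and bind it
 * to the given local port. Returns the descriptor, or -1 on failure.
 * make_broadcast_socket() is a hypothetical helper name. */
int make_broadcast_socket(unsigned short port)
{
    int on = 1;
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;
    if (setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)) < 0)
        goto fail;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        goto fail;
    return s;

fail:
    close(s);
    return -1;
}

/* Send buf to the limited broadcast address on the given port.
 * Without SO_BROADCAST set on s, this sendto() is expected to fail. */
ssize_t send_broadcast(int s, unsigned short port, const void *buf, size_t len)
{
    struct sockaddr_in to;

    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_addr.s_addr = htonl(INADDR_BROADCAST);
    to.sin_port = htons(port);
    return sendto(s, buf, len, 0, (struct sockaddr *)&to, sizeof(to));
}
```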

How to handle multiple sockets at the same time (blocking and non-blocking, synchronous and asynchronous I/O)

Multiple Sockets

Suppose we have a process which has to handle multiple sockets. We cannot simply issue a read on one of them, because the call will block until a request arrives on that particular socket. In the meantime a request may arrive on any other socket. To handle this input/output multiplexing we could use different techniques:

Busy waiting: In this method we make all operations on the sockets non-blocking and handle them together by polling. For example, we could call read() on each socket in turn, over and over. The disadvantage is that we waste a lot of CPU cycles. To make the system calls non-blocking we use:
fcntl(s, F_SETFL, FNDELAY);
Asynchronous I/O: Here we ask the operating system to tell us whenever I/O is possible on some socket. The operating system sends a signal whenever there is some I/O. When we receive a signal, we have to check all sockets and then wait until the next signal comes. But there are two problems: first, signals are expensive to catch, and second, we would not know if input arrives on one socket while we are doing I/O on another. For asynchronous I/O, we have a different set of commands (here we give the ones for a BSD variant of UNIX):
signal(SIGIO, io_handler);
fcntl(s, F_SETOWN, getpid());
fcntl(s, F_SETFL, FASYNC);
Separate process for each I/O: We could as well fork 10 different child processes for 10 different sockets. These child processes are lightweight and need some communication between them. Each process waits on its own socket, so it can use blocking system calls. However, this wastes a lot of memory, data structures and other resources.
Select() system call: We can use the select() system call to instruct the operating system to wait for any one of multiple events to occur and to wake up the process only when one of these events occurs. This way we know which socket the I/O request came from.
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);
void FD_CLR(int fd, fd_set *fdset);
int FD_ISSET(int fd, fd_set *fdset);
void FD_SET(int fd, fd_set *fdset);
void FD_ZERO(fd_set *fdset);
The select() function indicates which of the specified file descriptors is ready for reading, ready for writing, or has an error condition pending. If the specified condition is false for all of the specified file descriptors, select() blocks up to the specified timeout interval, until the specified condition is true for at least one of the specified file descriptors. The nfds argument specifies the range of file descriptors to be tested. The select() function tests file descriptors in the range of 0 to nfds-1. readfds, writefds and errorfds arguments point to an object of type fd_set. readfds specifies the file descriptors to be checked for being ready to read. writefds specifies the file descriptors to be checked for being ready to write, errorfds specifies the file descriptors to be checked for error conditions pending.
On successful completion, the objects pointed to by the readfds, writefds, and errorfds arguments are modified to indicate which file descriptors are ready for reading, ready for writing, or have an error condition pending, respectively. For each file descriptor less than nfds, the corresponding bit will be set on successful completion if it was set on input and the associated condition is true for that file descriptor. The timeout is an upper bound on the amount of time elapsed before select returns. It may be zero, causing select to return immediately. If the timeout is a null pointer, select() blocks until an event causes one of the masks to be returned with a valid (non-zero) value. If the time limit expires before any event occurs that would cause one of the masks to be set to a non-zero value, select() completes successfully and returns 0.
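The description of select() above can be illustrated with a short sketch that waits on several descriptors for reading and reports which one became ready. The helper name first_readable() and its millisecond timeout parameter are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait until one of the n descriptors in fds is readable, or until
 * timeout_ms milliseconds have passed. Returns the index of the first
 * ready descriptor, or -1 on timeout or error.
 * first_readable() is a hypothetical helper name. */
int first_readable(const int *fds, int n, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = -1;

    FD_ZERO(&readfds);
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest descriptor plus one: select() tests 0..nfds-1. */
    if (select(maxfd + 1, &readfds, NULL, NULL, &tv) <= 0)
        return -1;

    /* On return the set contains only the descriptors that are ready. */
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readfds))
            return i;
    return -1;
}
```

Because select() overwrites the sets it is given, the set is rebuilt with FD_ZERO() and FD_SET() on every call rather than reused.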

Synchronous and asynchronous (blocking and non-blocking):

Synchronous or Asynchronous?

* Practical example of blocking and non-blocking using a web browser:

http://www.scottklement.com/rpg/socktut/nonblocking.html

https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.hala001/orgblockasyn.htm

Sockets can operate in one of two modes: blocking and non-blocking. A blocking socket suspends the caller (no further line of code executes) after a system call until a reply arrives, a timeout expires, or some kind of error occurs. A non-blocking socket, on the other hand, continues execution after the system call and does not wait for the reply.

Let's say that you're writing a web browser. You try to connect to a web server, but the server isn't responding. When a user presses (or clicks) a stop button, you want the connect() API to stop trying to connect.

With what you've learned so far, that can't be done. When you issue a call to connect(), your program doesn't regain control until either the connection is made, or an error occurs.

The solution to this problem is called "non-blocking sockets". By default, TCP sockets are in "blocking" mode. For example, when you call recv() to read from a stream, control isn't returned to your program until at least one byte of data is read from the remote site. This process of waiting for data to appear is referred to as "blocking". The same is true for the write() API, the connect() API, etc. When you run them, the connection "blocks" until the operation is complete.

It's possible to set a descriptor so that it is placed in "non-blocking" mode. When placed in non-blocking mode, you never wait for an operation to complete. This is an invaluable tool if you need to switch between many different connected sockets, and want to ensure that none of them cause the program to "lock up."

Network communication (or file system access in general) system calls may operate in two modes: synchronous or asynchronous. In the synchronous mode, socket routines return only when the operation is complete. For example, accept() returns only when a connection arrives. In the asynchronous mode, socket routines return immediately: system calls become non-blocking calls (e.g., read() does not block waiting for data to arrive). You can change the mode with the fcntl() system call. For example,

fcntl(s, F_SETFL, FNDELAY);

sets the socket s to operate in asynchronous (non-blocking) mode.
Note: detailed usage of fcntl() can be seen in this programming example:
www.lowtek.com/sockets/select.html
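As a rough sketch of the fcntl() usage, the helper below switches a descriptor to non-blocking mode while keeping its other flags, and a second helper shows how a read then reports "no data yet" instead of waiting. It uses the POSIX name O_NONBLOCK; FNDELAY is an older name that many systems treat as equivalent. Both helper names and the -2 return convention are assumptions of this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put descriptor fd into non-blocking mode, preserving its other flags.
 * Returns 0 on success, -1 on error. Hypothetical helper name. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Attempt one read that never waits. Returns the byte count (0 means the
 * peer closed the connection), -2 if no data is available yet, or -1 on
 * any other error. Hypothetical helper name and return convention. */
ssize_t recv_nowait(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```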

Table 1. Socket programming interface actions
Call type	Socket state	Blocking	Nonblocking
Types of read() calls	Input is available	Immediate return	Immediate return
Types of read() calls	No input is available	Wait for input	Immediate return with EWOULDBLOCK error number (select() exception: READ)
Types of write() calls	Output buffers available	Immediate return	Immediate return
Types of write() calls	No output buffers available	Wait for output buffers	Immediate return with EWOULDBLOCK error number (select() exception: WRITE)
accept() call	New connection	Immediate return	Immediate return
accept() call	No connections queued	Wait for new connection	Immediate return with EWOULDBLOCK error number (select() exception: READ)
connect() call		Wait	Immediate return with EINPROGRESS error number (select() exception: WRITE)
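The connect() row of Table 1 matches the web browser scenario: start the connection in non-blocking mode, then use select() to wait for the socket to become writable, giving up after a time limit. The sketch below is one way this might be written; the helper name connect_with_timeout() and the millisecond timeout are assumptions, and checking SO_ERROR after select() is a common convention rather than something stated in the table.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Start a non-blocking connect to addr and wait at most timeout_ms for it
 * to finish. Returns a connected (still non-blocking) descriptor, or -1 on
 * failure or timeout. connect_with_timeout() is a hypothetical helper. */
int connect_with_timeout(const struct sockaddr_in *addr, int timeout_ms)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    if (s >= FD_SETSIZE)
        goto fail;

    int flags = fcntl(s, F_GETFL, 0);
    if (flags < 0 || fcntl(s, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    if (connect(s, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
        return s;                      /* connected immediately */
    if (errno != EINPROGRESS)
        goto fail;

    /* The attempt is in progress: the socket becomes writable once it
     * has completed, whether it succeeded or not. */
    fd_set wfds;
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    FD_ZERO(&wfds);
    FD_SET(s, &wfds);
    if (select(s + 1, NULL, &wfds, NULL, &tv) <= 0)
        goto fail;                     /* timeout or select() error */

    /* SO_ERROR reports whether the connection attempt actually succeeded. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
        goto fail;
    return s;

fail:
    close(s);
    return -1;
}
```

A stop button could be handled in a similar way by using a short timeout in a loop and checking a "cancelled" flag between calls to select().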

* select() lets you avoid blocking calls entirely:

When you use select() call logic, you do not issue any socket call on a given socket until the select() call tells you that something has happened on that socket; for example, data has arrived and is ready to be read by a read() call. By using the select() call, you do not issue a blocking call until you know that the call cannot block.

reference:
http://users.pja.edu.pl/~jms/qnx/help/tcpip_4.25_en/prog_guide/sock_advanced_tut.html
http://cse.iitk.ac.in/users/dheeraj/cs425/lec20.html
http://www.cs.rutgers.edu/~pxk/rutgers/notes/sockets/

https://www.quora.com/In-networking-programming-what-is-nonblocking-socket

http://www.ibm.com/developerworks/aix/library/au-tcpsystemcalls/
http://sock-raw.org/papers/sock_raw (socket very detail and worth reading)
# /etc/protocols:
# $Id: protocols,v 1.11 2011/05/03 14:45:40 ovasik Exp $
#
# Internet (IP) protocols
#
#   from: @(#)protocols   5.1 (Berkeley) 4/17/89
#
# Updated for NetBSD based on RFC 1340, Assigned Numbers (July 1992).
# Last IANA update included dated 2011-05-03
#
# See also http://www.iana.org/assignments/protocol-numbers

ip   0   IP       # internet protocol, pseudo protocol number
hopopt   0   HOPOPT       # hop-by-hop options for ipv6
icmp   1   ICMP       # internet control message protocol
igmp   2   IGMP       # internet group management protocol
ggp   3   GGP       # gateway-gateway protocol
ipv4   4   IPv4       # IPv4 encapsulation
st   5   ST       # ST datagram mode
tcp   6   TCP       # transmission control protocol
cbt   7   CBT       # CBT, Tony Ballardie <A.Ballardie@cs.ucl.ac.uk>
egp   8   EGP       # exterior gateway protocol
igp   9   IGP       # any private interior gateway (Cisco: for IGRP)
bbn-rcc   10   BBN-RCC-MON       # BBN RCC Monitoring
nvp   11   NVP-II       # Network Voice Protocol
pup   12   PUP       # PARC universal packet protocol
argus   13   ARGUS       # ARGUS
emcon   14   EMCON       # EMCON
xnet   15   XNET       # Cross Net Debugger
chaos   16   CHAOS       # Chaos
udp   17   UDP       # user datagram protocol
mux   18   MUX       # Multiplexing protocol
dcn   19   DCN-MEAS       # DCN Measurement Subsystems
hmp   20   HMP       # host monitoring protocol
prm   21   PRM       # packet radio measurement protocol
xns-idp   22   XNS-IDP       # Xerox NS IDP
trunk-1   23   TRUNK-1       # Trunk-1
trunk-2   24   TRUNK-2       # Trunk-2
leaf-1   25   LEAF-1       # Leaf-1
leaf-2   26   LEAF-2       # Leaf-2
rdp   27   RDP       # "reliable datagram" protocol
irtp   28   IRTP       # Internet Reliable Transaction Protocol
iso-tp4   29   ISO-TP4       # ISO Transport Protocol Class 4
netblt   30   NETBLT       # Bulk Data Transfer Protocol
mfe-nsp   31   MFE-NSP       # MFE Network Services Protocol
merit-inp   32   MERIT-INP       # MERIT Internodal Protocol
dccp   33   DCCP       # Datagram Congestion Control Protocol
3pc   34   3PC       # Third Party Connect Protocol
idpr   35   IDPR       # Inter-Domain Policy Routing Protocol
xtp   36   XTP       # Xpress Tranfer Protocol
ddp   37   DDP       # Datagram Delivery Protocol
idpr-cmtp   38   IDPR-CMTP       # IDPR Control Message Transport Proto
tp++   39   TP++       # TP++ Transport Protocol
il   40   IL       # IL Transport Protocol
ipv6   41   IPv6       # IPv6 encapsulation
sdrp   42   SDRP       # Source Demand Routing Protocol
ipv6-route   43   IPv6-Route       # Routing Header for IPv6
ipv6-frag   44   IPv6-Frag       # Fragment Header for IPv6
idrp   45   IDRP       # Inter-Domain Routing Protocol
rsvp   46   RSVP       # Resource ReSerVation Protocol
gre   47   GRE       # Generic Routing Encapsulation
dsr   48   DSR       # Dynamic Source Routing Protocol
bna   49   BNA       # BNA
esp   50   ESP       # Encap Security Payload
ipv6-crypt   50   IPv6-Crypt       # Encryption Header for IPv6 (not in official list)
ah   51   AH       # Authentication Header
ipv6-auth   51   IPv6-Auth       # Authentication Header for IPv6 (not in official list)
i-nlsp   52   I-NLSP       # Integrated Net Layer Security TUBA
swipe   53   SWIPE       # IP with Encryption
narp   54   NARP       # NBMA Address Resolution Protocol
mobile   55   MOBILE       # IP Mobility
tlsp   56   TLSP       # Transport Layer Security Protocol
skip   57   SKIP       # SKIP
ipv6-icmp   58   IPv6-ICMP       # ICMP for IPv6
ipv6-nonxt   59   IPv6-NoNxt       # No Next Header for IPv6
ipv6-opts   60   IPv6-Opts       # Destination Options for IPv6
#   61           # any host internal protocol
cftp   62   CFTP       # CFTP
#   63           # any local network
sat-expak   64   SAT-EXPAK       # SATNET and Backroom EXPAK
kryptolan   65   KRYPTOLAN       # Kryptolan
rvd   66   RVD       # MIT Remote Virtual Disk Protocol
ippc   67   IPPC       # Internet Pluribus Packet Core
#   68           # any distributed file system
sat-mon   69   SAT-MON       # SATNET Monitoring
visa   70   VISA       # VISA Protocol
ipcv   71   IPCV       # Internet Packet Core Utility
cpnx   72   CPNX       # Computer Protocol Network Executive
cphb   73   CPHB       # Computer Protocol Heart Beat
wsn   74   WSN       # Wang Span Network
pvp   75   PVP       # Packet Video Protocol
br-sat-mon   76   BR-SAT-MON       # Backroom SATNET Monitoring
sun-nd   77   SUN-ND       # SUN ND PROTOCOL-Temporary
wb-mon   78   WB-MON       # WIDEBAND Monitoring
wb-expak   79   WB-EXPAK       # WIDEBAND EXPAK
iso-ip   80   ISO-IP       # ISO Internet Protocol
vmtp   81   VMTP       # Versatile Message Transport
secure-vmtp   82   SECURE-VMTP       # SECURE-VMTP
vines   83   VINES       # VINES
ttp   84   TTP       # TTP
nsfnet-igp   85   NSFNET-IGP       # NSFNET-IGP
dgp   86   DGP       # Dissimilar Gateway Protocol
tcf   87   TCF       # TCF
eigrp   88   EIGRP       # Enhanced Interior Routing Protocol (Cisco)
ospf   89   OSPFIGP       # Open Shortest Path First IGP
sprite-rpc   90   Sprite-RPC       # Sprite RPC Protocol
larp   91   LARP       # Locus Address Resolution Protocol
mtp   92   MTP       # Multicast Transport Protocol
ax.25   93   AX.25       # AX.25 Frames
ipip   94   IPIP       # Yet Another IP encapsulation
micp   95   MICP       # Mobile Internetworking Control Pro.
scc-sp   96   SCC-SP       # Semaphore Communications Sec. Pro.
etherip   97   ETHERIP       # Ethernet-within-IP Encapsulation
encap   98   ENCAP       # Yet Another IP encapsulation
#   99           # any private encryption scheme
gmtp   100   GMTP       # GMTP
ifmp   101   IFMP       # Ipsilon Flow Management Protocol
pnni   102   PNNI       # PNNI over IP
pim   103   PIM       # Protocol Independent Multicast
aris   104   ARIS       # ARIS
scps   105   SCPS       # SCPS
qnx   106   QNX       # QNX
a/n   107   A/N       # Active Networks
ipcomp   108   IPComp       # IP Payload Compression Protocol
snp   109   SNP       # Sitara Networks Protocol
compaq-peer   110   Compaq-Peer       # Compaq Peer Protocol
ipx-in-ip   111   IPX-in-IP       # IPX in IP
vrrp   112   VRRP       # Virtual Router Redundancy Protocol
pgm   113   PGM       # PGM Reliable Transport Protocol
#   114           # any 0-hop protocol
l2tp   115   L2TP       # Layer Two Tunneling Protocol
ddx   116   DDX       # D-II Data Exchange
iatp   117   IATP       # Interactive Agent Transfer Protocol
stp   118   STP       # Schedule Transfer
srp   119   SRP       # SpectraLink Radio Protocol
uti   120   UTI       # UTI
smp   121   SMP       # Simple Message Protocol
sm   122   SM       # SM
ptp   123   PTP       # Performance Transparency Protocol
isis   124   ISIS       # ISIS over IPv4
fire   125   FIRE
crtp   126   CRTP       # Combat Radio Transport Protocol
crdup   127   CRUDP       # Combat Radio User Datagram
sscopmce   128   SSCOPMCE
iplt   129   IPLT
sps   130   SPS       # Secure Packet Shield
pipe   131   PIPE       # Private IP Encapsulation within IP
sctp   132   SCTP       # Stream Control Transmission Protocol
fc   133   FC       # Fibre Channel
rsvp-e2e-ignore   134   RSVP-E2E-IGNORE
mobility-header   135   Mobility-Header       # Mobility Header
udplite   136   UDPLite
mpls-in-ip   137   MPLS-in-IP
manet   138   manet       # MANET Protocols
hip   139   HIP       # Host Identity Protocol
shim6   140   Shim6       # Shim6 Protocol
wesp   141   WESP       # Wrapped Encapsulating Security Payload
rohc   142   ROHC       # Robust Header Compression
#   143-252 Unassigned                                       [IANA]
#   253     Use for experimentation and testing           [RFC3692]
#   254     Use for experimentation and testing           [RFC3692]
#   255                 Reserved                             [IANA]

NetOsPgm

Sunday, March 1, 2015
