The Newsvendor Problem from a Press Distributor's Viewpoint

The newsvendor problem is usually considered from the point of view of the selling point. The editor cannot optimize his profits without full information about the distribution at each point. Distributors generally use heuristic methods, based on their market knowledge and past experience. The assignment of supply to each point can instead be based on the optimization of profits or costs or, alternatively, on hybrid time series methods. This yields a disaggregated estimate of the global supply, based on the editor's policy regarding the maximum desired proportion of unsold copies and of outlets that run out of stock. This methodology increases the productivity of the staff in charge of distribution decisions by introducing objective criteria for the assignment of services through a network of selling points. The total print run decided by the editor implies modifications in the decisions of the assignment process, in accordance with the distributor's business objectives.


Introduction
The distribution of a perishable good, such as a newspaper or a magazine, is done through a network of heterogeneous selling points with random demand. Generally, the assignment of services to each establishment is carried out by heuristic methods, based on accumulated experience [1]. The distribution company has to balance the cost of returning unsold copies against the cost of the demand left unmet because of undersupply. In the marketplace, data on the real demand are not obtained [2], as the agents are not encouraged to provide reliable information and, therefore, generally prefer a higher service (supply) level [3]. The patterns of the demand distributions make it possible to obtain, in an objective way, the optimal service level, the probability of sufficient supply at each selling point, and the expected lost sales due to running out of stock. With network connections at the selling points, integrating the evolution of sales into a distribution database should increase the accuracy of prediction techniques. Computer programs that include some of these ideas are being developed in Madrid, from where press distribution software is provided to several countries in Europe and America; they use novel econometric and statistical procedures. The Sirius program supports a press distribution policy based on objective criteria, together with the heuristic system of distribution tables based on previous experience, and includes some probabilistic patterns modelling the demand at each selling point during the display period (one day in the case of newspapers, a longer period for magazines). Some models are based on dynamic Poisson [4], negative binomial and other discrete distributions, depending on the average demand. The use of an arithmetic family of distributions allows a general representation of the demand at each of the points of a selling network, both for the daily press and for magazines.
These models are used in the optimization of distribution in a network, to implement new tools that help to increase significantly the productivity of the staff in charge of distribution, and to establish objective controls on the decisions involved [5]. An open line of work that can produce positive results in the short term consists of forecasting the global demand while making it compatible with the demand estimated at the selling-point level, with the aim of extending demand prediction services upstream, that is, towards the editors, and of controlling distribution costs. Additional statistical tools should make it possible to tackle some particular events: Bayesian methods for some decision problems, dynamic time series models, and neural networks are among them [6]. Some cost/benefit functions related to the distribution process of the daily press and of periodical magazines are studied [7], both from the distributor's point of view and from the editor's. These functions depend on two parameters: the unit cost of returning an unsold copy, Pd, and the implicit cost associated with a copy not sold when a selling point runs out of stock, Pa. The optimization of costs or benefits is useful both from the editor's point of view and from the distributor's; [8] treated real and simulated data using edition and distribution costs. A more restrictive assignment of services can increase benefits, taking into account the cost of returned copies and of unmet demand, while at the same time reducing the proportion of returned copies without affecting sales.

Distribution Processes of Daily Press and Magazines
The print run St corresponding to period t (a day, a week or a month, depending on the frequency of the publication) is usually estimated using dynamic econometric models, with intervention variables to account for extraordinary events (such as promotions) that might affect the demand. In a network of k selling points, the proportion of the print run assigned to point j is pjt = sjt/St, so the assignment of services, or supply, to this outlet is sjt = pjt St, j = 1,2,…,k. The basic aim of every distribution process is to minimize the number of returned unsold copies, as well as the number of points that run out of stock, both at the global level and at the individual selling points. These objectives cannot be fully met at the same time. An alternative, not totally equivalent, is to maximize benefits or to minimize distribution costs, merging the two original objectives. Nevertheless, the random nature of the local demand prevents an exact estimate. Data are available every day: the sales vjt, the supplied quantity sjt, and the number of returned copies djt, for all the selling points j = 1,2,…,k; these are stored in the distributor's databases. Two additional quality measures of the distribution are the proportion of returned copies, dt, and the proportion of selling points that have run out of stock, at, the latter based on the indicators kjt = 1 if djt = 0, and kjt = 0 otherwise. Over a time interval t = 1,2,…,T, such as the fiscal year, these two proportions are related by a hyperbolic function with an additive error term ε, which comes from a theoretical efficiency model based on Cobb-Douglas functions. This relation is used to evaluate the distribution process and for interannual comparisons.
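As an illustration of the two quality measures, the following minimal sketch computes them for a single period from the stored supply and sales figures. It assumes that returned copies equal supply minus sales and that the share of returns is measured against the total supply; the function name and the sample figures are hypothetical.

```python
def distribution_quality(supplied, sold):
    """Return (share of copies returned, share of outlets sold out) for one period."""
    if len(supplied) != len(sold) or not supplied:
        raise ValueError("need one supply and one sales figure per selling point")
    # Assumption: returned copies d_jt = s_jt - v_jt
    returned = [s - v for s, v in zip(supplied, sold)]
    # k_jt = 1 if the outlet returned nothing (it ran out of stock), 0 otherwise
    sold_out = [1 if d == 0 else 0 for d in returned]
    return sum(returned) / sum(supplied), sum(sold_out) / len(supplied)


# Hypothetical figures for a network of four selling points
supplied = [30, 12, 50, 8]
sold = [25, 12, 41, 8]
share_returned, share_sold_out = distribution_quality(supplied, sold)
```

Computing both shares for every period of a fiscal year gives the pairs of values to which the hyperbolic relation mentioned above would be fitted.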
The economic literature contains numerous references to the distribution problem with random demand at a single selling point, as is the case in press distribution. Other questions, concerning the estimation of the optimal quantity of newspapers to supply to a store, are also dealt with. The single-period problem (SPP) approach, for one day or a predetermined time (for example one week, in the case of a weekly magazine), is extended to multi-period situations [9].
The criterion of maximizing benefits is the most frequent approach, as in Khouja [10]. A usual macro-economic approach is to consider the utility of the selling point or of the client; however, it is not very practical to implement in an integrated framework, such as distribution software, because of the diversity of utility functions that each of the economic agents mediating in the process may have.
The inclusion of the distributor's and the editor's costs and benefits has been dealt with in several papers by Rodríguez [11], including applications to real data. All these developments correspond to a bottom-up approach that starts at the basic unit (the selling point), where it is assumed that the individual demand follows a probability law to be determined, whose parameters change with time. Statistical methods (generally maximum likelihood) are used to estimate the distribution of the 'demand at one point'. From this, it is possible to optimize the distribution process, taking into account the aims of the agent that controls the distribution, that is, to minimize costs, to maximize benefits, or to minimize the unmet demand due to undersupply, including the costs of returned copies and of undersupply. The interests of every party mediating in the process are included in the estimation of the level of service associated with each point. Any distribution process may follow different strategies [14]. The usual procedure is to estimate the global demand, and thus the forecasted sales, Vt, for period t; the print run St is then obtained by applying to Vt a discount rate d decided by the editor: the maximum proportion of returned copies allowed. The top-down processes aim to determine the assignment of services sjt = pjtSt, j = 1,2,…,k, for each point of the network. Several techniques are available, using the last n sales data prior to day t, with m denoting the delay of the last available data.
From this information, together with the number of copies supplied to selling point j in the same period, the estimated demand for period t is obtained. It is assumed (and confirmed with real data) that its distribution, Dj, belongs to a previously identified parametric family (in our case, several arithmetic distributions and their continuous approximations) whose parameter(s), θjt, are estimated from the previous data (1) and (2). The specification of Dj is generally stable for each point. With the predictions of the demand at each selling point for the day or period t and the corresponding distribution of the demand, the estimate of the number of copies, or services, to be assigned to each selling point is obtained through a function H corresponding to the level of service, which may be implemented in several ways:
• By means of a system of heuristic tables based not on the distribution of Dj but on its parameters.
• Using semi-heuristic methods with a basic model in which sjt depends on its immediate past (for example, the values for the same day of the week in previous weeks), optionally using models such as exponential smoothing techniques, with corrections for significant events.
• Identifying the distribution Dj and estimating a function H with an optimization criterion, such as maximizing the expected benefit B(sjt) or minimizing the expected cost C(sjt), or trying to control jointly the proportion of points running out of stock and the number of returned copies by limiting the probabilities of these situations.
These individual predictions are adjusted to fit the global print run, in such a way that the assignments sjt, j = 1,2,…,k, add up to St. The available methods are:
• Temporal models for sjt based on the past data (1) and (2). Some techniques that are simple to implement are exponential smoothing and its variants; weekly cyclical components must be included. They have the advantage of simplicity, and they do not need a specification of the distributions Dj(θjt), j = 1,2,…,k.
• Methods based on the estimation of some parameters θjt of these distributions (generally the mean and variance), using the available data sets, with the corresponding filters to remove weekly cycles. The tables implemented in Credimática's software are an example of these methods, and some variants of these decision tables simplify the process. Again, they have the advantage of simplicity, and they do not need to estimate the distributions of the demand at each point.
• Estimates derived from the distributions Dj(θjt), optimizing an objective function (maximization of expected benefits, minimization of expected costs, limits on the probability of returning a prefixed proportion of the service sjt, or of not running out of stock).
The first advantage of this last approach is the possibility of incorporating economic criteria into the distribution process, instead of the classic estimates based on measuring the proportion of returned copies and of selling points running out of stock; it also allows a company policy to be incorporated regarding the value assigned to each unsold copy relative to the benefit produced by a sale. These procedures, together with the optimization of costs/benefits, improve on the classic criteria of distribution. They even open the way to subsequent developments that modify the demand estimates taking into account the print run. Other developments are possible with this method, to increase the productivity of the persons in charge of the distribution process, as are some more general approaches to the general distribution problem. Obvious generalizations have been omitted from this exposition, as they require theoretical developments within the theory of stochastic processes in order to account for the temporal autocorrelation among the data (1), which hinders the application of classical statistical techniques in the estimation process.
In what follows, the development focuses on the behaviour of a single selling point, omitting the subindex j that represents it. The third approach presented above introduces some improvements in the distribution process, which are discussed below: while maintaining the sales level, the number of returned copies is reduced and, at the same time, the number of selling points that run out of stock decreases. Other developments, such as measuring the global efficiency of the distribution process for each time period and comparing periods, will be addressed later.
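The simplest of the methods described in this section, a temporal model on same-weekday sales followed by an adjustment to the global print run, can be sketched as follows. This is only an illustration under stated assumptions: plain exponential smoothing with a hypothetical smoothing constant `alpha`, and a proportional rescaling with largest-remainder rounding, which is one of several possible ways to make the assignments add up to the print run. It is not the procedure implemented in any particular distribution software.

```python
def smooth_same_weekday(history, alpha=0.3):
    """Simple exponential smoothing over past sales of the same weekday (oldest first)."""
    level = float(history[0])
    for v in history[1:]:
        level = alpha * v + (1 - alpha) * level
    return level


def fit_to_print_run(forecasts, print_run):
    """Scale forecasts proportionally and round so the assignments add up to print_run."""
    total = sum(forecasts)
    raw = [f * print_run / total for f in forecasts]
    assigned = [int(r) for r in raw]  # integer part of each proportional share
    leftover = print_run - sum(assigned)
    # Give the remaining copies to the outlets with the largest fractional parts
    order = sorted(range(len(raw)), key=lambda j: raw[j] - assigned[j], reverse=True)
    for j in order[:leftover]:
        assigned[j] += 1
    return assigned


# Hypothetical Monday sales for three selling points, oldest first
histories = [[20, 22, 19, 21], [5, 7, 6, 6], [40, 38, 42, 41]]
forecasts = [smooth_same_weekday(h) for h in histories]
services = fit_to_print_run(forecasts, print_run=80)
```

Using only same-weekday data is a crude way of handling the weekly cycle; a model with explicit weekly components would use all the days.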

Analysis of a Selling Point: Cost Function
The number of copies demanded on day (or period) t at selling point j of a network is a random variable. Knowledge of its statistical distribution is necessary to optimize the number of papers to be supplied. Hereafter, we omit the sub-indexes t and j, as the demand, D = Djt, is always referred to t and j. Its distribution is determined by its cumulative distribution function FD(x), which is discrete and arithmetic. The corresponding probability function is fD(x); if the demand at the selling point is large, it can be approximated by a continuous density. The number of copies sold is limited by the level of service (at that date and selling point), s. The cost function therefore has two terms: the cost Pd of each copy returned when the demand is below s, and the cost Pa of each copy demanded but not supplied when the demand exceeds s. The expected cost corresponding to the level of service s, C(s), is obtained as a sum or as an integral with respect to the distribution of D, depending on whether the demand is treated as discrete or approximated through a continuous distribution. A subjective estimate is needed of the implicit price, Pa, of the unmet demand for a paper. The cost of returning a copy, Pd, includes its value, delivery and recall costs, and those derived from the underlying administrative process, minus its residual value. The second term is generally less important, as the density of the demand decreases sharply. The global demand for a newspaper is easier to estimate than the distribution of D at each point, and the sum of the assignments to the selling points must not differ much from the decided print run. The interest of analysing one selling point in isolation is obvious: a great number of daily decisions must be taken on the number of copies to be distributed to individual points, and the staff in charge would otherwise devote much time to solving hundreds of isolated problems that can be automated. The demand distribution is selected within a parametric family: arithmetic distributions of the kind C(a,b,n), which includes some usual distributions (further on, a Poisson distribution is used). If the demand is large enough, a continuous distribution can be used as an adequate approximation.
In both cases, we are always dealing with parametric families Dj(θjt), j = 1,2,…,k, at each point j of a network and for a day or period t. Again, we leave aside these sub-indexes when dealing with a single selling point, and consider the form of the demand distribution to be known, except for the parameter(s) θ, which need to be estimated.
Although a non-parametric approach is possible, parametric modelling of the demand at a selling point is better from a practical point of view, as it is precise enough to improve technological productivity and personnel costs. The parameters of the distribution are estimated from historical data. The aim of the decision process is to forecast the quantity of papers to be supplied to each selling point by optimizing some economic criterion, taking into account the company's policy: to minimize the expected additional costs, C(s), associated with inefficiencies in the distribution, or to maximize the expected benefits. In many cases a Poisson distribution is an adequate model for the daily demand, and Normal approximations can be used for establishments with higher sales. Both distributions are reproductive with respect to their mean, which is why the demand for a magazine over its display period will follow a distribution of the same family. Nevertheless, additional information is sometimes available in practice, for example an important news event, a commercial promotion, or a specific demand, which leads to the introduction of Bayesian methodology. In summary, the main focus is to give some guidelines for applying optimal assignments of services at each of the points of a network.
There are companies specialized in the logistics of press distribution, each of them providing services to different editors; they also produce market studies and estimates of the print run. The responsibility for the assignment rests with these companies. The retailers may influence the supply but, except in a few cases (such as big promotions), their influence is small. Frequently, due to economic and even sociological interests, their perception of the demand may be misleading. The selling points usually pay a fixed fee for distribution rights, independently of the quantity of newspapers or magazines returned, which reduces their incentive to estimate the demand accurately; they tend to request an excessive number of copies, with the corresponding direct loss for the editor.
Although the function C(s) is defined for non-negative integer values, if s is assumed continuous, C is continuous and differentiable. It also has a minimum, reached at a point near the one obtained in the discrete case. The marginal cost function, C'(s), is used to obtain this minimum. The distribution cost increases with the print run, but presents economies of scale. Unit production costs are low compared to the cost of returning a copy. Their increase is smaller than that of the sales level at each point, as can be observed in the following graph, where marginal cost functions are presented for several estimates of the parameter of a Poisson demand distribution. The distribution data correspond to a major daily newspaper distributed in Spain during 2007. To find a relative minimum there should be a value t at which the marginal cost changes from negative to non-negative, as stated by Caridad & Rodríguez [15].
These results generalize to any usual discrete distribution. Thus, the expected cost function C(s) has the following properties:
• There exists at least one relative minimum for s > 0, or there is an absolute minimum at C(0).
• If the minimum is not unique, the minima are at successive integer values.
• Starting from this relative minimum, C(s) is increasing.
Therefore, the optimization problem reduces to finding a relative minimum of C(s).
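The search for the relative minimum can be illustrated with a short sketch for Poisson demand. It assumes the usual newsvendor form of the expected cost, C(s) = Pd·E[(s − D)+] + Pa·E[(D − s)+], which is consistent with the definitions of Pd and Pa above but is not copied from the original expressions; the function names, the truncation of the upper tail and the numerical values are hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """Probability function of a Poisson demand, computed on the log scale."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def expected_cost(s, lam, p_d, p_a, tail=200):
    """Assumed form: C(s) = Pd*E[(s-D)+] + Pa*E[(D-s)+], upper tail truncated at s+tail."""
    over = sum((s - x) * poisson_pmf(x, lam) for x in range(s))
    under = sum((x - s) * poisson_pmf(x, lam) for x in range(s + 1, s + tail))
    return p_d * over + p_a * under


def optimal_service(lam, p_d, p_a):
    """Walk upwards from s = 0 until the expected cost stops decreasing."""
    s = 0
    while expected_cost(s + 1, lam, p_d, p_a) < expected_cost(s, lam, p_d, p_a):
        s += 1
    return s


def critical_fractile_service(lam, p_d, p_a):
    """Smallest s with F_D(s) >= Pa / (Pa + Pd), the classical newsvendor rule."""
    target, cdf, s = p_a / (p_a + p_d), 0.0, 0
    while True:
        cdf += poisson_pmf(s, lam)
        if cdf >= target:
            return s
        s += 1


# Hypothetical outlet: mean demand of 20 copies, unmet demand valued at twice a return
s_walk = optimal_service(20.0, p_d=1.0, p_a=2.0)
s_rule = critical_fractile_service(20.0, p_d=1.0, p_a=2.0)
```

Under this cost form the increment C(s+1) − C(s) equals (Pd + Pa)·FD(s) − Pa, so the upward walk and the fractile rule stop at the same integer; because C(s) is increasing beyond the relative minimum, the walk never has to look past its first stop.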

Analysis of a Selling Point: Profit Function
There are several scenarios for specifying the profit function of an establishment: considering only the costs of returned unsold copies, considering only the costs associated with unmet demand when running out of stock, and considering both types of costs. Although the methodology used in each case is similar to the one followed in the minimization of costs, the resulting supply, s, is similar but not identical, as the optimization criterion is different.

Maximization of Benefits Considering the Costs of Returned Copies
The profit function for a level of service s depends on Pi, the net income per copy sold, and on Pd, the cost per returned copy; the expected profit is its expectation with respect to the demand distribution.
Available online at www.managementjournal.info
Penalties associated with copies not sold because of running out of stock are not included. If the average daily sales, μ, is not too low (between 10 and 25 copies), the first term is practically null for large values of s, and profits might become negative. It is necessary to specify the ratio between the profit from selling a paper and the loss associated with its return, that is, ki = Pi/Pd. In some cases it is close to ki = 2. The shape of this function is given in the following figure:

Fig. 3: Profit function BD(s) with demand N(20; 3²)
To obtain the maximum, the marginal benefit must be zero, and a numerical method is used. The second derivative has two zeros, as expected in this family of functions; thus, there exists a unique maximum. The expected profit then starts decreasing (this is clear in the case Pi > Pd, which is the one considered).
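A numerical search of this kind can be sketched for the Normal demand N(20; 3²) of Fig. 3 with ki = 2. The sketch assumes that the profit is Pi·min(D, s) − Pd·(s − D)+, a form consistent with the description above but not taken from the original expressions, so its result need not coincide with the figures reported in the paper. The function names are hypothetical, and the search is restricted to integer service levels.

```python
from statistics import NormalDist


def expected_profit_returns_only(s, mu, sigma, p_i, p_d):
    """Assumed form: E[Pi*min(D, s) - Pd*(s - D)+] for D ~ N(mu, sigma^2)."""
    std = NormalDist()
    z = (s - mu) / sigma
    expected_returns = (s - mu) * std.cdf(z) + sigma * std.pdf(z)  # E[(s - D)+]
    expected_sales = s - expected_returns                          # E[min(D, s)]
    return p_i * expected_sales - p_d * expected_returns


def best_service(mu, sigma, p_i, p_d, s_max=100):
    """Integer service level with the highest expected profit, by direct search."""
    return max(
        range(s_max + 1),
        key=lambda s: expected_profit_returns_only(s, mu, sigma, p_i, p_d),
    )


# Demand N(20; 3^2) and ki = Pi/Pd = 2
s_best = best_service(20.0, 3.0, p_i=2.0, p_d=1.0)
# Continuous counterpart: the marginal profit is zero where F_D(s) = Pi / (Pi + Pd)
s_continuous = NormalDist(20.0, 3.0).inv_cdf(2.0 / 3.0)
```

Under this assumed form the marginal profit is Pi − (Pi + Pd)·FD(s), which is why the continuous optimum is a quantile of the demand distribution and the integer search lands next to it.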

Maximization of Profits Considering Both Costs
Now both the costs associated with returned unsold copies and the implicit costs of unmet demand, caused when a point of sale runs out of stock, are taken into account.

Fig. 4: Expected profits, B(s). Demand Normal N(20, 3²)
In this case the maximum is reached when s = 26.
In fact, the form of the function follows from the expressions of its first and second derivatives.
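The role of the first derivative can be illustrated with a sketch that adds the out-of-stock penalty to the profit. It assumes B(s) = E[Pi·min(D, s) − Pd·(s − D)+ − Pa·(D − s)+] with Normal demand; this form and the price values are assumptions chosen for illustration, and they are not meant to reproduce the optimum reported above.

```python
from statistics import NormalDist


def expected_profit(s, mu, sigma, p_i, p_d, p_a):
    """Assumed form: E[Pi*min(D,s) - Pd*(s-D)+ - Pa*(D-s)+] for D ~ N(mu, sigma^2)."""
    std = NormalDist()
    z = (s - mu) / sigma
    density_term = sigma * std.pdf(z)
    expected_returns = (s - mu) * std.cdf(z) + density_term      # E[(s - D)+]
    expected_unmet = (mu - s) * (1 - std.cdf(z)) + density_term  # E[(D - s)+]
    expected_sales = mu - expected_unmet                         # E[min(D, s)]
    return p_i * expected_sales - p_d * expected_returns - p_a * expected_unmet


def continuous_optimum(mu, sigma, p_i, p_d, p_a):
    """First-order condition B'(s) = 0, i.e. F_D(s) = (Pi + Pa) / (Pi + Pa + Pd)."""
    return NormalDist(mu, sigma).inv_cdf((p_i + p_a) / (p_i + p_a + p_d))


# Hypothetical prices: Pi = 2, Pd = 1, Pa = 1
s_star = continuous_optimum(20.0, 3.0, p_i=2.0, p_d=1.0, p_a=1.0)
profit_at_optimum = expected_profit(s_star, 20.0, 3.0, 2.0, 1.0, 1.0)
```

With this form, B'(s) = (Pi + Pa) − (Pi + Pa + Pd)·FD(s) and B''(s) is minus (Pi + Pa + Pd) times the demand density, so the function is concave and any positive Pa moves the optimum above the one obtained with return costs only.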

A Case Study: A Point of Sale with Average Demand
Sales are not the same as demand, although they are related by V = min{D, s}. The distribution of the random variable V is obtained from the distribution of the demand, truncating it at the service level, s, supplied to the point of the network. Thus the probability or density function of sales coincides with that of the demand below s, and the remaining probability, P(D ≥ s), is concentrated at s. Demand cannot be observed directly, but only through sales at each point, and with a time lag, due to the delay of m days in returning unsold copies to the distributor.
Using a sample of past values of supply and sales, (si, vi), i = 1,2,…,n, a fitting process should be carried out to identify the demand distribution. A parametric approach is possible, estimating the parameters θ associated with the sales distribution. For example, with a Gaussian demand, this parameter is bivariate and sufficient. Of course, the experimental design is not without difficulties, as the observations are clearly not independent. One way to overcome this is to consider data separated in time by several days. This is feasible for daily newspapers, using data from the same day of each week. Even in this case, it is necessary to use non-stationary stochastic processes to take the autocorrelation into account.
Once identified, the demand distribution could be used to forecast the supply needed, s, and to estimate lost sales due to undersupply.
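The identification step can be sketched as a maximum likelihood fit in which sold-out periods are treated as censored observations. The sketch assumes a Poisson demand with a constant mean and independent observations, which ignores the autocorrelation discussed above; the function names and the data are hypothetical.

```python
import math


def poisson_pmf(x, lam):
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def poisson_tail(s, lam):
    """P(D >= s), summed upwards from s until the terms become negligible."""
    total, x = 0.0, s
    while True:
        term = poisson_pmf(x, lam)
        total += term
        if x > lam and term <= 1e-17 * total:
            return total
        x += 1


def censored_loglik(lam, supply, sales):
    """Log-likelihood of sales V = min(D, s): exact when v < s, censored when v = s."""
    loglik = 0.0
    for s, v in zip(supply, sales):
        prob = poisson_pmf(v, lam) if v < s else poisson_tail(s, lam)
        loglik += math.log(max(prob, 1e-300))
    return loglik


def fit_poisson_demand(supply, sales, iters=200):
    """Ternary search for the mean that maximizes the (concave) log-likelihood."""
    lo, hi = 1e-6, 3.0 * max(max(supply), 1)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if censored_loglik(m1, supply, sales) < censored_loglik(m2, supply, sales):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


# Hypothetical outlet that sold out on three of five Mondays
supply = [5, 5, 5, 5, 5]
sales = [5, 5, 4, 5, 3]
lam_hat = fit_poisson_demand(supply, sales)
```

When no period is sold out the estimate reduces to the average of the sales; with sold-out periods it lies above that average, which is the correction that an analysis of sales alone would miss.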

Fig. 5: Sales and service for a weekly publication
A case study is presented (figure 5), comparing this methodology of supply forecasting with the one usually employed by press distributors, based on heuristic assignment tables that depend on recent past sales. In each case, a database containing time series of supply and unsold copies for every period (a day, for a newspaper) is available. The example uses a sample of daily Monday data of a major Spanish newspaper, at a point of sale in Madrid's urban area. When the point runs out of stock, the real demand is unknown, but it can be estimated using the proposed models, avoiding costly market studies. The service decisions have been made using the classical heuristic tables, and in our modelling a Normal demand distribution is employed. Finally, the services forecast by optimising profits with Pi = 2Pd are:
• Using returned copies costs: s ≈ 50.
• Using out-of-stock costs, with the maximum mean expected profits for the distributor: s = 35.95 ≈ 36.
• Using both types of costs: s ≈ 50, 51.
Lost sales due to undersupply are estimated in the following table. Taking into account both types of costs (for undersupply and for returned copies), the assignment process based on the maximization of profits leads to lost sales of 0.20% for this point, while if the distributor wants to optimize the expected income, this would increase to 7.19%. Using s = 41, based on the costs of under- and oversupply, losses would reach 2.67%. These ratios are based upon total real sales; using total potential sales, they would be 0.2%, 6.8% and 2.6%. The real distribution applied to this point, using the heuristic tables method, is provided in the following table.
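The two kinds of lost-sales ratio quoted above, relative to real sales and relative to potential sales, can be computed for any service level once a demand distribution has been fitted. The sketch below assumes a Normal demand; the mean and standard deviation are hypothetical and are not the estimates of the case study, so the percentages it produces are not those of the paper.

```python
from statistics import NormalDist


def lost_sales_ratios(s, mu, sigma):
    """Expected lost sales relative to expected sales and to expected demand."""
    std = NormalDist()
    z = (s - mu) / sigma
    unmet = (mu - s) * (1 - std.cdf(z)) + sigma * std.pdf(z)  # E[(D - s)+]
    sales = mu - unmet                                        # E[min(D, s)]
    return unmet / sales, unmet / mu


# Hypothetical demand N(40; 5^2), evaluated at three candidate service levels
ratios = {s: lost_sales_ratios(s, 40.0, 5.0) for s in (36, 41, 50)}
```

The ratio over real sales is always the larger of the two, because expected sales cannot exceed expected demand.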

Conflicting Interests between Different Agents and a Compromise Solution
An editor can maximize his expected profits not only by acting upon his production costs, but also by carefully watching the distribution process. The terms of the contracts with the distributors are the principal factor that could lead to benefits or losses. Besides this, the editor would want to increase his sales to the maximum possible level. At a point of sale with a service level s and d unsold papers, the cost and profit functions are the following (fixed costs are not considered, as they do not influence short-term distribution).
For the editor, the unit prices or costs are: Pf^E, the edition or production cost; Pd^E, the value of an unsold copy returned to the editor; and Pa^E, the implicit cost for the editor of a copy demanded and not supplied due to an underestimated demand. The unit profits are: Pi^E, the editor's income for a sold copy, and Pd^E, the estimated loss due to an unsold copy. For the distributor, Pd^D is the cost of a returned copy borne by the distributor, and Pa^D is the estimated distributor cost for a copy not sold due to underestimation of the demand. The distributor collects the sales proceeds and its commissions, as a fixed amount per point of sale. Its cost and profit functions are built from these unit values; the expected cost and benefit functions are C(s) and B(s).
In the short term, with perfect information, all the agents involved would try to optimize their own profits, but maximizing sales provides a different utility (and return) to each of them. This situation could lead to inefficiencies in the distribution process. A point of sale wishes to maximize the number of papers sold and bears a null marginal cost; thus, it tends to ask for a larger supply of papers, larger than the number supplied by the distributor, who is limited by the additional costs of distribution and edition.
The agents involved should maximize their respective profit functions, reaching an equilibrium point in the short term. Conflicts of interest are natural, as the editor tends to supply a smaller number of papers than the distributor (and the points of sale) would like. To avoid losses, the editor needs information about the number of copies supplied to each point and the corresponding sales. The data at each point are available only through the distributor. A close collaboration between the two is necessary to optimize the distribution process as a whole. For example, consider the expected income and costs of a point of sale with Poisson-type demand. Deviations from the optimum can lead to losses for the editor, who is conditioned by the distributor's decisions. Thus, if the editor is not able to identify the demand distribution at each point of sale, his profits will depend on the distributor's decisions. Controlling an upper limit for the supply is not the main interest of the distributor, whose decisions follow its own marginal expected profit. The point of sale has no influence on pricing, and its desired supply should not be below the point that maximizes its marginal profit. The distributor has the same information on which to decide, but he will use the aggregated expected profits over the whole network. The value of the information is a central point in the contractual relations between editors and distributors; the issue could be solved by ownership control of the distribution firms by the editors or, alternatively, by agreements under which the distributors provide the flow of disaggregated information about supply and sales. In Caridad and Rodriguez (2004), some generalizations are considered, as well as spatial and temporal problems.

Some Concluding Remarks
Press distribution is a well-studied problem, usually from the point of view of the salesman. But other economic agents take part in it, with partially conflicting goals, depending on their commercial agreements and on legal restrictions. In Spain, editors are not usually in charge of the distribution, which is outsourced to distributors. These operate with full information, but they can observe only sales at each point of a network, not the real demand. Deciding the number of copies that should be sent to each sales point is conditioned by several factors, usually decided by the distributor, but affecting the editor's profit and loss account. Modelling the demand is the only way to optimize the whole process, and this should be done at each point of a network. If the editor does not have access to this information, the optimization will be at his own expense. One could determine the value of complete versus partial information. Many statistical problems arise in the identification of the demand distribution at a point of the network. Some have been addressed here, and others are being studied: for example, seasonal aspects of sales, temporal autocorrelation in micro data, spatial autocorrelation between nearby points, and the introduction of restrictions derived from business policies. Alternative approaches could be data driven, with obvious computational advantages (which should not be overlooked); even time series models for sales are easy to implement, and the results are quite satisfactory. More complex models, such as hybrid time series and neural network models, are even more promising, except for the computational burden and the need for long data series, which are not available for monthly, or even weekly, publications.
This leads to the present development, which can be easily implemented with appropriate software and which makes it possible to include the editor's and the distributors' policies, to adjust the distribution process across all the selling points of the network according to the desired print run, and to optimize the economic objectives. It has been implemented in commercial software and successfully used in several countries.